Maintained Relationships on Facebook

This past week the Economist published a piece entitled Primates On Facebook that described some research done by the Facebook Data Team. Since there have been a number of questions throughout the monkeysphere, we thought we would take the opportunity to describe our approach, the data, and our analysis.

[Figure: network-comparison]

We were asked a simple question: is Facebook increasing the size of people’s personal networks? This is a particularly difficult question to answer, so as a first attempt we looked into the types of relationships people do maintain, and the relative size of these groups. The image above presents a high-level overview of our findings: while the average Facebook user communicates with a small subset of their entire friend network, they maintain relationships with a group two times the size of this core. This not only affects each user, but also has systemic effects that may explain why things spread so quickly on Facebook.

Before discussing the data, let us first set the context.

People you know

Many people are asking questions about the number of friends they have on Facebook. Do I have enough? Do I have too many? What may be tripping people up here is the language: while the people you’re connected to on Facebook are called your “friends,” they’re more likely people you have met at some point in your life. Social network researchers have been trying to measure this number for decades, and have come up with a number of clever techniques.

If you’ve read The Tipping Point, you may remember a study Gladwell described where people were asked to identify whether or not they knew people with names from a long list culled from a phone book. Based on the probability of knowing someone with a given name and the number of people with this name that a person knows, we can estimate the number of people a given subject has met. Killworth et al. found, using this technique and others, that the number of people a person will know in their lifetime ranges somewhere between 300 and 3000[1].
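
To make the name-based estimate concrete, here is a minimal sketch of a simple scale-up calculation. It assumes that a person knows members of each name group in proportion to that group's share of the whole population; the function name, the name-group sizes and the population figure are all invented for illustration, and this is a simplification rather than the exact procedure used in the cited study.

```python
def estimate_network_size(known_counts, group_sizes, total_population):
    """Scale-up estimate of how many people a respondent knows.

    known_counts[i]  -- how many people with name i the respondent knows
    group_sizes[i]   -- how many people in the population have name i
    total_population -- size of the whole population

    Assumption: the respondent knows the same fraction of each name
    group as they know of the population overall.
    """
    if len(known_counts) != len(group_sizes):
        raise ValueError("one known count is needed per name group")
    total_group = sum(group_sizes)
    if total_group == 0:
        raise ValueError("name groups must not all be empty")
    fraction_known = sum(known_counts) / total_group
    return fraction_known * total_population


# Hypothetical numbers: three surnames covering 10,000 people in a
# population of 1,000,000; the respondent knows 3 of those people.
estimate = estimate_network_size(
    known_counts=[0, 1, 2],
    group_sizes=[1_000, 3_000, 6_000],
    total_population=1_000_000,
)
print(estimate)
```

Pooling several names before dividing, rather than averaging per-name estimates, keeps a single rare name from dominating the result.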

On Facebook, the average number of friends that a person has is currently 120[2]. Given that Facebook has only been around for 5 years, that not everyone uses it, and that not all acquaintances have found each other, this number seems reasonable for an average user.

Communication network

As a subset of the people you know, there are some individuals with whom you communicate on an ongoing basis. The number of individuals that represent a person’s core support network has been found to be much, much smaller than their entire network. Peter Marsden found the number of people with whom individuals “can discuss important matters” numbers only 3 for Americans[3]. In a subsequent survey, researchers found that this number has dropped slightly over the past 10 years[4], causing some alarm in the press, but without sufficient explanation[5].

How many people an individual communicates with probably exists somewhere between their total network size and their support network. Some research by Gueorgi Kossinets and Duncan Watts observing all email communication at a university shows that the number of ongoing contacts hovers somewhere between 10 and 20 over a 30 day period[6].

Maintained Relationships

Facebook and other social media allow for a type of communication that is somewhat less taxing than direct communication. Technologies like News Feed and RSS readers allow people to consume content from their friends and stay in touch with the content that is being shared. This consumption is still a form of relationship management as it feeds back into other forms of communication in the future. For instance, a high school friend uploads a photo of her new puppy and this photo appears in your News Feed. You click on the photo, browse through a host of other photos and discover that she has also gotten engaged, which may lead you to reach out to her.

This type of communication is the core of the Facebook experience, and given the question posed by the Economist, we wondered what effect this sort of relationship maintenance had on the breadth of people’s networks.

Measuring Networks on Facebook

To try and answer questions about network size on Facebook, we looked at the communications of a random sample of users over the course of 30 days. We defined networks in 4 different ways:

  • All Friends: the largest representation of a person’s network is the set of all people they have verified as friends.
  • Reciprocal Communication: as a measure of a sort of core network, we counted the number of people with whom a person had had reciprocal communications, or an active exchange of information between two parties.
  • One-way Communication: the total set of people with whom a person has communicated.
  • Maintained Relationships: to measure engagement, we took the set of people for whom a user had clicked on a News Feed story or visited their profile more than twice.
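
As a rough illustration of the four definitions above, the sketch below counts each network for one user from a toy activity log. The data layout (pairs of sender and recipient, pairs of viewer and viewed person), the function name and the reading of "one-way" as "the user contacted this friend" are assumptions made for the example, not a description of the actual analysis pipeline.

```python
from collections import Counter


def network_sizes(user, friends, messages, views, view_threshold=2):
    """Count the four network types for `user`.

    friends  -- set of the user's confirmed friends
    messages -- iterable of (sender, recipient) pairs
    views    -- iterable of (viewer, viewed) pairs, one per News Feed
                click or profile visit
    A relationship counts as maintained when the user has viewed a
    friend more than `view_threshold` times (more than twice by default).
    """
    sent_to = {dst for src, dst in messages if src == user and dst in friends}
    heard_from = {src for src, dst in messages if dst == user and src in friends}

    view_counts = Counter(
        viewed for viewer, viewed in views if viewer == user and viewed in friends
    )
    maintained = {f for f, n in view_counts.items() if n > view_threshold}

    return {
        "all_friends": len(friends),
        "reciprocal": len(sent_to & heard_from),  # contact in both directions
        "one_way": len(sent_to),                  # the user reached out at least once
        "maintained": len(maintained),
    }


# Toy 30-day log for a hypothetical user "a".
friends = {"b", "c", "d", "e"}
messages = [("a", "b"), ("b", "a"), ("a", "c")]
views = [("a", "d")] * 3 + [("a", "e")] * 2
sizes = network_sizes("a", friends, messages, views)
print(sizes)
```

Plotting each of these counts against `all_friends` across a sample of users would give curves of the kind described in this section.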

For each user we calculated the size of their reciprocal network, one-way network and network of maintained relationships, and plotted this as a function of the number of friends a user has. As Andreas mentions in his blog post about the article, the visualization (shown below) did not make it into the article, but presents a pretty clear picture of the relationship between these types of communication.

[Figure: active-network-size]

In the diagram, the red line shows the number of reciprocal relationships, the green line shows the one-way relationships, and the blue line shows the passive relationships, each as a function of a user’s network size. This graph shows the same data as the first graph, only combined for both genders. What it shows is that, relative to the number of people a Facebook user actively communicates with, that user is passively engaging with between 2 and 2.5 times more people in their network. I’m sure many people have had this feeling, but these data make this effect more transparent.

Systemic Effects

What effect does a 2x increase in connectivity have on a network? The easiest way to observe this is to look at one person’s personal network. The image below shows the personal network for one of my coworkers. The first diagram shows his entire network, namely all of his friends, and all of the relationships between his friends. It is clear that the cluster on the top is the highly connected set of Facebook coworkers, and the cluster on the right is another group of friends.

[Figure: asmith-connections]

The cell on the bottom right shows only those relationships that have reciprocal communication. Many of the individuals in his network are completely disconnected or out of touch with each other. Moving to the bottom left cell, we see the slightly more connected network containing one-way communication. This includes every person who wrote a comment, sent a message or wrote a wall post to one of my coworker’s other friends. The cell on the top-right shows the passive network, including all those people who were keeping up with their friends. While some of his friends are still disconnected, a very large percentage are now reachable through some set of observations.

The stark contrast between reciprocal and passive networks shows the effect of technologies such as News Feed. If these people were required to talk on the phone to each other, we might see something like the reciprocal network, where everyone is connected to a small number of individuals. In an environment where everyone is passively engaged with each other, an event such as a new baby or an engagement can propagate very quickly through this highly connected network.
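
One way to put a number on this contrast is to ask how many people can reach each other through some chain of links in each version of the network. The sketch below does that with a breadth-first search over a toy friend group; the node names and edge lists are made up, and treating passive observation as an undirected link is a simplifying assumption.

```python
from collections import defaultdict, deque


def largest_component_size(nodes, edges):
    """Size of the largest group of mutually reachable people."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    best = 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search from each person not yet visited.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            person = queue.popleft()
            size += 1
            for neighbor in adjacency[person]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical friend group: few reciprocal links, more passive ones.
people = ["a", "b", "c", "d", "e", "f"]
reciprocal_edges = [("a", "b")]
passive_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]

print(largest_component_size(people, reciprocal_edges))
print(largest_component_size(people, passive_edges))
```

In this toy example the largest reachable group doubles once the passive links are included, which is the kind of change that lets news travel further.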

While these data are not a controlled experiment, and do not directly relate to the theories described above, they do show a directional trend in the way people manage relationships on a social network today. We hope to continue this line of research with the eventual hope of making relationships that much easier to manage.

This post represents the work of data scientists Lee Byron, Tom Lento, Cameron Marlow, and Itamar Rosenn. Special thanks to Alex Smith for letting us use him as an example. For more insights like this, make sure to become a fan of the Facebook Data Team.

Footnotes

  1. Killworth, P., Johnsen, E., Russell, H. B., Shelley, G. A., and McCarty, C. Estimating the size of personal networks. Social Networks 12 (1990), 289–312.
  2. Facebook Statistics
  3. Marsden, P. Core discussion networks of Americans. American Sociological Review 52, 1 (1987), 122–131.
  4. McPherson, M., Smith-Lovin, L., and Brashears, M. E. Social isolation in America: Changes in core discussion networks over two decades. American Sociological Review 71, 3 (June 2006), 353–375.
  5. While this work is well cited, there is support that the methodology underestimates the core network, e.g. Bearman, P., and Parigi, P. Cloning Headless Frogs and Other Important Matters: Conversation Topics and Network Structure. Social Forces 83 (2004), 535.
  6. Kossinets, G., and Watts, D. J. Empirical analysis of an evolving social network. Science 311, 5757 (January 2006), 88–90.

Some overdue updates

I have the unhealthy expectation that Facebook = Reality, and sometimes I forget that not everyone has a news feed (or reads theirs every day). In case you missed them, here are some recent events in my life:

  • I got engaged to my lovely girlfriend
  • I started a new job as a research scientist at Facebook
  • I found out that my cholesterol is astronomically high

These are, of course, in some order of importance. More on each of these in due time, but I should state a few caveats to quell the fidgeting audience:

  • We do not have a date yet, we’re currently in engagement-celebration mode
  • My new job is roughly like my old job, except at Facebook instead of Yahoo!
  • Thanks to exercise and some dietary changes, my cholesterol seems to be in check for the time being

Happy New Year! See you in 2008.

Facebook opens registration

Facebook has recently been making big changes, such as offering APIs and experimenting with privacy. Some of these changes have been met with positive feedback, and others with hostility, but it is obvious from these recent experiments that they are testing new waters. Probably the biggest change they have proposed, though, is opening registration to anyone interested in joining (see TechCrunch’s coverage). Facebook’s message to users makes it sound as though they are providing a needed service, but I think their intentions are clear: they want to beat MySpace, and they aren’t going to wait for long.

As with any massively engaged social system, it’s extremely hard to predict how the entire community will collectively react to a decision like open registration. In order to think about how this change might affect adoption and usage, let me first introduce two unique qualities of their current system.

Fresh networks: College students have a unique need for networking software. When a freshman arrives at school, they have few friends, and an overwhelming number of people to interact with. Somehow every year, hundreds of thousands of freshmen figure things out and new networks arise. Facebook provides a service to these newcomers, allowing them to search and locate people with similar tastes in a much more efficient manner.

Natural privacy: The first security model employed by Facebook was extremely restrictive, allowing only those individuals at a given school to see others within the same domain. However, this boundary sits at a natural location: schools are communities with extremely strong ingroup affiliation, and growing or shrinking this boundary does not make the group any more cohesive. Schools have formal systems for dealing with problems that might arise from students, taking the load off of Facebook.

Both of these properties are changing with open registration. First, people signing up from outside a college will not be in the position of looking for an entirely new network of friends. This means growth will be much slower, and will not reach the saturation rates that Facebook sees among college users. Instead of having nearly 100% of college students, they will be selecting for users who have certain demographic profiles.

Second, privacy will no longer be as simple as being in the same email domain as your friends. The site has a host of new privacy features, such as specifying the level of visibility of your profile to each friend. The complexity introduced by this lack of natural boundaries will make it harder for the system to match users’ real lives. Those students that used the system because it was easy might rethink their decision.

Third, the boundaries that created strong ingroup affiliation will no longer be relevant. Even though privacy boundaries will still exist, because users will have more friends from the outside, the distinction between “my college” and the outside world will not be as relevant. If users no longer consider Facebook a college tool, it might very well stop being used as one.

To restate, it’s hard to predict how massive social systems will change with the introduction of new members, but opening registration to the masses will certainly introduce some sort of catalyst into the system. They were smart to wait until this year’s incoming class had adopted the tool, but we may very well see a different reaction from new students next year.

Privacy and transparency

Recently Facebook has introduced a feature that has attracted a lot of attention among users and bloggers. This piece of the system, called the News Feed, shows you the activity of your contacts within Facebook. If your friend posts to their blog, uploads a photo, attends an event or changes almost anything in their profile, these show up in your news feed. Many users have voiced complaints, saying that this is an infringement of their privacy, namely exposing their activities in a way that makes it easy for people to track their behavior.

Why would Facebook implement such a feature? My guess is that they hope this new level of transparency will cause people to be more active. For instance, if I see that my friend is attending an event, I might choose to join them and attend as well. This type of event log is not new; Upcoming has had a similar feature for many months now. In both cases these logs detail events happening around the system that could be observed otherwise, but in a form that is much more easily consumed. Why do Facebook users care? The system has now made people’s actions too transparent, in a way that limits their ability to express themselves without fear of their privacy being breached. In fact, this new feature may have the opposite effect from the one Facebook anticipated: users might start censoring their actions in order to avoid being noticed.

As another example, take the political campaign contribution website Fundrace.org. Thanks to campaign financing laws, all contributions of over $200 to a political campaign or party must be made public. These data are collected by the Federal Election Commission (FEC) and made available in electronic form for download. Eyebeam researchers took the files provided by the FEC, indexed them, and made a search engine available to the web. Now anyone could easily find contributors by their last name or address.

One result of this simple transformation was that campaign contributors thought their privacy had been breached. Even though their contributions are required to be public, that does not mean that they are required to be indexed so that people can be easily found. In this case Eyebeam made the FEC data more transparent, and as a result, those who contributed felt betrayed.

In developing social software, there is an inherent tradeoff between transparency and privacy; finding the correct balance is a demanding task, and one that needs to be carefully user-tested. While the overall benefit to a system might be positive, some features will cause some users to be angry, while others will result in serious privacy infringements. While the benefits of transparency can be huge, users must feel safe and protected at all times, and the transition from comfort to discomfort can happen in a matter of seconds.

Online communities just a few years ago were mainly opaque about user activities, most probably in protection of EULAs and privacy advocacy. Nearly every Web 2.0 company now alerts you when people have affected your account, such as someone adding you as a contact in Flickr or MySpace. It’s only now that we’re pushing up against this line of transparency, and I would expect that in the next few years a set of best practices will evolve as to which features are admissible and which are not.