Global Information Networks:- a talk by Jon Kleinberg at the CCC Library of Congress Symposium 2009

SPECIAL FEATURE Library of Congress Logo

Computing Research that Changed the World: Reflections and Perspectives

March 25, 2009 | 8:45 am - 5:00 pm | Members' Room, Thomas Jefferson Building, Library of Congress

Global Information Networks

JON KLEINBERG - Cornell University Slides - 768 KB Download - 340 MB Watch the Talk (20:11)

Global Information Networks Over the past decade, ideas from computer science have offered new ways of thinking about the social networks and social processes that link people together. Rapidly proliferating on-line networks such as e-mail, instant messaging, social networking sites, blogs, and the World Wide Web have increasingly blurred the distinction between social and technological networks. They have provided individuals unprecedented opportunities for maintaining existing social ties, and for establishing new ties that cross traditional geographic, social and cultural boundaries. They allow individuals from various backgrounds, with diverse intentions or qualifications, to participate in public discussion. At the same time, these new media create records of interactions and activities, providing researchers from the social sciences, from computer science, and from other disciplines with an unprecedented wealth of data on human interactions. They provide the opportunity for major advances in the social sciences, using ideas from computation.

As a result, we have seen an increasing confluence of social sciences and computer science. Among the questions falling within the purview of computer science are these: Can we detect patterns of action and interaction from large traces of human behavior? Can we build mathematical models to help explain these patterns? And what do these models tell us about the way systems should be designed for large populations of users?

We can extract patterns from very basic human interactions. For example, we consider the density of locations for which pictures have been uploaded to the popular photo-sharing site Flickr. Not only can we extract a workable map, with main roads, coastlines and a good idea of population densities; we can correctly infer labels such as city names or the names of monuments and other sightseeing destinations. Using methods from computer vision, we can select "characteristic" views for these locations.

As a second example, we can analyze text on blogs and news sites over a period of time to find individual words or phrases that appear particularly frequently. Without any supervision or domain knowledge, we recover a surprisingly accurate timeline of important political developments. The source for this timeline is nothing but a trace of the behavior of large numbers of individuals, in this case, bloggers and news writers; when analyzing their collective behavior, we see the emergence of patterns that teach us about the time-varying interests of society as a whole.

As a third example, we can examine how a computational perspective has led to new inter¬pretations of a classic experiment in social psychology, and in turn has suggested new ideas for the organization of both social networks and computer networks. In the 1960's, Stanley Milgram and colleagues aimed to understand the path lengths in the network of friendships in the United States. Individuals in Nebraska were asked to forward a letter to a friend of Milgram's near Boston. The letter could only be forwarded to one person, and only to a friend of the current holder of the letter. The completed chains have surprisingly small lengths, with a median length of 6 hops. From this experiment, the phrase "6 degrees of separation" entered pop culture. Recently, Leskovec et al. analyzed the graph of Microsoft Instant Messenger contacts, and found similar distributions of distances on a much larger network.

The choice made by Milgram to have individuals forward the letter only to one person at a time raises an interesting set of questions: Even if these short paths exist in social networks, how would individuals with their myopic view of the network be able to find them? What are the clues by which individuals make their choices of to whom to forward the letter? What properties must the friendship network exhibit for such simple choices to be successful?

This type of question is a significant departure from a purely static analysis of social networks. We are not asking "what?" but "how?" - an inherently algorithmic question. An important contribution of computer science is to look at the world in algorithmic terms, and ask algorithmic questions. Indeed, Milgram asked individuals about their decision-making process, and found that the two most important criteria were geographic distance and similarity in professions. That raises the following concrete question: What does it tell us about a social network if myopic forwarding based solely on geographic or social distance succeeds in identifying short chains?

An analysis of simple mathematical models for social networks embedded in physical space suggests a qualitative answer to this question: It takes the right balance of links at different distance scales. Roughly speaking, to enable successful forwarding, each individual should have about the same number of friends at distances between 1-10, 10-100, 100-1000, 1000-10000, and so on. In other words, even though there are many more people at larger distances, each distance scale occupies roughly the same attention for each individual. The forwarding chain looks much like the routing in explicitly designed systems such as the US Postal System: a letter is first forwarded to the right country, then the right city, right block, right building, and finally the correct office. Yet the USPS routing system is carefully designed and maintained at great cost, whereas human friendship patterns have evolved naturally.

The examples illustrate two complementary contributions of computer science to the social sciences: the design of systems at the boundary of social and technological networks, and the modeling of social systems in light of their algorithmic behavior. Both shed new light on fundamental questions in the social sciences, but they also raise many new questions and concerns. Why do social processes produce the outcomes they do? Communication technology itself is changing the way in which people interact - how do the patterns of interaction change as a result of technological innovation? Will geographic location lose its high correlation with friendships in light of online social networks? How can we protect the privacy of individuals? Do we actually want computer systems that can learn deep patterns in our behavior from raw traces? The most fundamental question - and one to which computer science is poised to make significant contributions - is how much understanding of ourselves as human beings we can gain with the help of these new scientific developments.

Footnote: Barbara Liskov was named the 2008 ACM A. M. Turing Award Winner; the award, named for British mathematician Alan M. Turing, is widely considered to be the "Nobel Prize in Computing."

This is an archived version of CCC's website. Please visit the new ccc website for the latest information.

Computing Research that Changed the World: Reflections and Perspectives

Global Information Networks