Why We're Able to Google: a talk by Al Spector at the CCC Library of Congress Symposium 2009


Computing Research that Changed the World: Reflections and Perspectives

March 25, 2009 | 8:45 am - 5:00 pm | Members' Room, Thomas Jefferson Building, Library of Congress

Why We're Able to Google

AL SPECTOR - Google Inc. (talk length: 19:41)

The technology underlying the World Wide Web is based on decades of research, both federally sponsored and industry-based, in a diversity of subfields of computer science, ranging from algorithms, to computer architecture and networking, to distributed systems, to information retrieval. Many important research ideas have led to the web, "cloud computing," pervasive search, and other capabilities that we now take for granted. Technical challenges still abound, providing fertile ground for further advances.

On the Internet, one can usually type a query into a personal computer or phone and get useful results very quickly. This has brought about a societal change: amateurs can get answers without traveling to special collections or having special knowledge and training. Many more services are now available online, including libraries, videos, and images. Via machine translation between 40 language pairs, one can type a search request in one language and get answers about what is being written in another language and culture. This capability is based on decades of research in such areas as machine translation and machine learning.

The Internet allows information to flow back and forth across billions of devices: computers, mobile phones, sensor devices. What is the architecture of the "cloud" of hardware and software? Computers called servers, connected via the Internet, reliably serve the requests of billions of people; servers are grouped together in very large clusters of a hundred thousand or more. Three categories of software, working together, constitute the so-called Cloud that provides users with services, and computer science has had significant impact on all three. The operating system (for example, Unix or Microsoft Windows) is the technology at the lowest level; it controls all the devices and provides the ability to handle many tasks simultaneously and recover from errors. On top of the operating system layer is the Distributed Computing Infrastructure layer, whose job is to make the large number of individual computers in many clusters behave as if the collection were a single, coherent "multi-computer," able to aggregate information despite having it distributed across an arbitrary number of computers. On top of that layer are all the applications, e.g., Google Maps and Google Search. The Cloud needs to get the right information to the user quickly via a variety of devices, and to operate reliably at a truly global scale that was unimaginable even a decade ago.
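
As a rough illustration of what "behaving as a single multi-computer" can mean, here is a minimal scatter-gather sketch in Python. It is not taken from the talk and does not describe Google's actual infrastructure: the shard contents, the function names search_shard and search, and the word-count scoring are all assumptions made for illustration. Each in-memory dictionary stands in for one server holding part of the data; the caller sees a single search function.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each dictionary stands in for one server's share of documents.
SHARDS = [
    {"doc1": "library of congress", "doc2": "computing research"},
    {"doc3": "research that changed the world"},
    {"doc4": "world wide web research", "doc5": "cloud computing"},
]


def search_shard(shard, term):
    """Query one shard; return {doc_id: number of occurrences of term}."""
    hits = Counter()
    for doc_id, text in shard.items():
        count = text.split().count(term)
        if count:
            hits[doc_id] = count
    return hits


def search(term, shards=SHARDS):
    """Scatter the query to every shard in parallel, then gather and merge."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, term), shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    # Highest count first; ties broken by document id for a deterministic order.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(search("research"))
```

The caller never learns which shard held which document. A real infrastructure layer must also cope with failed or slow servers, replication, and security, none of which this sketch attempts.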

What R&D had to come together to make all this possible? Progress in programming and programming languages has been essential. Even more importantly, computers are no longer special-purpose, closed environments; both corporate and "open source" communities can now build software components that work together for as yet unanticipated uses. Making the huge number of components in the network work reliably in the aggregate, and with appropriate security, is an ongoing, hugely important problem area. Distributed systems research provides techniques for creating the illusion of a single coherent environment. With the advent of the World Wide Web, over five decades of research on how to structure, hyperlink, and retrieve information came to fruition. Making all that information accessible via user-friendly technologies developed by the computer science community is singularly important. These technologies include the mouse and the graphical point-and-click user interface, speech recognition, language translation, and content that is not restricted to text but is multimedia, including graphics, audio, and video. To take one important example, research starting in the 1960s on how to define the relevance of retrieved information, practically as well as mathematically, fed into the "PageRank" algorithm used by Google and other search engines to yield useful results, typically on the first page of results.
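
For readers unfamiliar with the idea behind PageRank, the following Python sketch shows the commonly taught power-iteration formulation: a page is important if important pages link to it. This is an illustrative sketch only, not the implementation used by any production search engine. The function name pagerank, the damping value 0.85 (a conventional choice), the iteration count, and the four-page link graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration; links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical link graph: page "c" is linked to by "a", "b", and "d".
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this toy graph, page "c" ends up ranked highest because three of the four pages link to it, while "d", which nothing links to, ends up lowest. The ranks always sum to 1, so they can be read as the share of time a random surfer would spend on each page.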

Looking at the recipients of the ACM A.M. Turing Award, the equivalent of the Nobel Prize in our field, reveals a ground-breaking set of technologies that have proven seminal to creating the world of online information that we enjoy today. Some of these advances came from academe, some from industry, and some from collaborations between these sectors. The same is true for the ACM Software Systems Award. The field of computer science continues to be wide open, with tremendous opportunities for innovation. We have unbridled optimism about what our field will achieve in the future. The fluidity encouraged by our country's startup mentality, coupled with research investments, has been and will continue to be instrumental to progress and indeed to our economic prosperity.

