
machine learning, linguistics, India, ancient languages



Machine Learning Applied to Indus Script

Examples of the Indus script. The four square artifacts with animal and human iconography are stamp seals that measure one or two inches per side. On the top right are three elongated seals that have no iconography, as well as three miniature tablets (one twisted). The tablets measure about 1.25 inches long by 0.5 inches wide.

The Rosetta Stone allowed 19th century scholars to translate symbols left by an ancient civilization and thus decipher the meaning of Egyptian hieroglyphics. But the symbols found on many other ancient artifacts remain a mystery, including those of a people that inhabited the Indus valley on the present-day border between Pakistan and India. Some experts question whether the symbols represent a language at all, or are merely pictograms that bear no relation to the language spoken by their creators.

Rajesh Rao, a University of Washington computer scientist, has led a statistical study of the Indus script, comparing the pattern of its symbols to those of various linguistic scripts and nonlinguistic systems, including DNA and a computer programming language. The results, published online by the journal Science, indicate that the Indus script's pattern is closer to that of spoken words, supporting the hypothesis that it codes for an as-yet-unknown language.

"We applied techniques of computer science, specifically machine learning, to an ancient problem," said Rao. "At this point we can say that the Indus script seems to have statistical regularities that are in line with natural languages."
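This article does not spell out the statistic used. The published study reports comparing conditional entropy, a measure of how predictable each symbol is given the one before it. The Python sketch below is only an illustration of that measure on toy sequences; the function name `conditional_entropy` and the sample strings are assumptions made here, not the authors' code or data.

```python
import math
from collections import Counter


def conditional_entropy(sequence):
    """Return H(next symbol | current symbol) in bits for adjacent pairs.

    Illustrative helper (hypothetical name), not the study's implementation.
    """
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (first, _), count in pair_counts.items():
        p_pair = count / total                     # P(current, next)
        p_next_given_first = count / first_counts[first]  # P(next | current)
        entropy -= p_pair * math.log2(p_next_given_first)
    return entropy


# Toy sequences (made up for illustration only).
rigid = list("ABABABAB")   # each symbol fully determines the next
mixed = list("AABBA")      # after A or B, either symbol is equally likely

print(conditional_entropy(rigid))
print(conditional_entropy(mixed))
```

A rigidly ordered sequence gives a value near zero, while a sequence in which any symbol can follow any other gives a value near the maximum. As reported in the study, natural-language scripts fall between these extremes, which is the sense in which a script can be said to show "statistical regularities that are in line with natural languages."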

Co-authors are Nisha Yadav and Mayank Vahia at the Tata Institute of Fundamental Research in Mumbai, India; Hrishikesh Joglekar, a software engineer from Mumbai; R. Adhikari at the Institute of Mathematical Sciences in Chennai, India; and Iravatham Mahadevan at the Indus Research Center in Chennai. The research was supported by the Packard Foundation and the Sir Jamsetji Tata Trust.

Rajesh P.N. Rao (University of Washington)

Nisha Yadav (Dept. of Astronomy & Astrophysics, Tata Institute of Fundamental Research, Mumbai 400005, India, and Centre for Excellence in Basic Sciences, Mumbai 400098, India)

Mayank N. Vahia (Dept. of Astronomy & Astrophysics, Tata Institute of Fundamental Research, Mumbai 400005, India, and Centre for Excellence in Basic Sciences, Mumbai 400098, India)

Hrishikesh Joglekar (14, Dhus Wadi, Laxminiketan, Thakurdwar, Mumbai 400002 India)

R. Adhikari (The Institute of Mathematical Sciences, Chennai 600113, India)

Iravatham Mahadevan (Indus Research Centre, Roja Muthiah Research Library, Chennai 600113, India)

Research support provided by:
David and Lucile Packard Foundation, Sir Jamsetji Tata Trust

Computing Research Highlight of the Week is a service of the Computing Community Consortium and the Computing Research Association designed to highlight some of the exciting and important recent research results in the computing fields. Each week a new highlight is chosen by CRA and CCC staff and volunteers from submissions from the computing community. Want your research featured? Submit it!