Wednesday, July 7, 2010

Ancient Language Deciphered By Computer

The lost language of Ugaritic was last spoken 3,500 years ago in the city of Ugarit, located in modern Syria. Today, it survives on only a few tablets, and linguists were only able to translate it with years of hard work and some luck. Yet, a new computer program deciphered it in just hours.

Created by Regina Barzilay, an associate professor in MIT’s Computer Science and Artificial Intelligence Lab, Ben Snyder, a grad student in her lab, and the University of Southern California’s Kevin Knight, the computer program relies on a few basic assumptions in order to make intuitive guesses about the language's structure. One requirement is that the lost language must be closely related to a known, deciphered language; in the case of Ugaritic, that related language is Hebrew. Another is that the alphabets of the two languages must share at least some consistent correspondences between individual letters or symbols. The program worked by looking for such correlations and correspondences between the two languages, then mapping the similarities between Hebrew and Ugaritic.
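
To give a rough feel for what "looking for correspondences" between two alphabets can mean, here is a toy sketch in Python. It is not the researchers' actual model. It simply tries every one-to-one letter mapping and keeps the one that turns the most unknown-script words into words found in a known word list. The function name, the made-up symbols X, Y, Z, and the tiny word list are all assumptions for illustration, not real Ugaritic or Hebrew data.

```python
from itertools import permutations


def best_letter_mapping(unknown_words, known_lexicon, unknown_alphabet, known_alphabet):
    """Toy brute-force search (hypothetical helper, not the real system).

    Tries every one-to-one assignment of unknown letters to known letters
    and returns the assignment that decodes the most words into entries
    of the known lexicon, along with that hit count.
    """
    best_mapping, best_hits = {}, -1
    for candidate in permutations(known_alphabet, len(unknown_alphabet)):
        mapping = dict(zip(unknown_alphabet, candidate))
        # Decode every unknown word letter by letter under this mapping.
        decoded = ["".join(mapping[ch] for ch in word) for word in unknown_words]
        hits = sum(word in known_lexicon for word in decoded)
        if hits > best_hits:
            best_mapping, best_hits = mapping, hits
    return best_mapping, best_hits


# Invented example: three "lost language" words written with symbols X, Y, Z,
# and a small word list in a "known" language written with a, b, t.
unknown_words = ["YXZ", "ZXY", "XZ"]
known_lexicon = {"bat", "tab", "at"}

mapping, hits = best_letter_mapping(unknown_words, known_lexicon, "XYZ", "abt")
print(mapping, hits)
```

Trying every mapping only works for a handful of letters. With a thirty-letter alphabet the number of possible mappings is astronomically large, which is one reason a real decipherment system needs a statistical approach rather than exhaustive search.
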
The results were stunning. Of the thirty letters in the Ugaritic alphabet, the computer correctly identified twenty-nine. Of the roughly one-third of Ugaritic words that have Hebrew cognates, the program correctly identified sixty percent, and many of its errors were off by only a letter or two. These results are particularly encouraging because the program still doesn't use any contextual clues, meaning it can't differentiate between the uses of a Ugaritic word that means both "daughter" and "house", something that is (thankfully) pretty easy to identify in context.
After Ugaritic was first discovered in 1929, it remained untranslatable for years. It was only through happy coincidence that it was ever translated. The computer program, however, was able to get this far in just a matter of hours. The possibilities this program offers for speeding up the translation of ancient documents are readily apparent. Additionally, the program could help improve online translation software.

The Ugaritic language is a Semitic relative of Hebrew, though its alphabet resembles the cuneiform used in ancient Sumer. The Ugaritic texts that survive tell the stories of a Canaanite religion that is similar to that recorded in the Old Testament. The differences between the two bodies of text provide scholars a unique opportunity to examine how the Bible and ancient Israelite culture developed in relation to the neighboring cultures.
