Me Translate Pretty One Day
Spanish to English? French to Russian? Computers haven't been up to the task. But a New York firm with an ingenious algorithm and a really big dictionary is finally cracking the code.
By Evan Ratliff
JAIME CARBONELL, CHIEF science officer of Meaningful Machines, hunches over his laptop in the company's midtown Manhattan offices, waiting for it to decode a message from the perpetrators of a grisly terrorist attack. Running software that took four years and millions of dollars to develop, Carbonell's machine – or rather, the server farm it's connected to a few miles away – is attempting a task that has bedeviled computer scientists for half a century. The message isn't encrypted or scrambled or hidden among thousands of documents. It's simply written in Spanish: "Declaramos nuestra responsabilidad de lo que ha ocurrido en Madrid, justo dos años y medio después de los atentados de Nueva York y Washington."
I brought along the text, taken from a Spanish newspaper transcript of a 2004 al Qaeda video claiming responsibility for the Madrid train bombings, to test Meaningful Machines' automated translation software. The brainchild of a quirky former used-car salesman named Eli Abir, the company has been designing the system in secret since just after 9/11. Now the application is ready for public scrutiny, on the heels of a research paper that Carbonell – who is also a professor of computer science at Carnegie Mellon University and head of the school's Language Technologies Institute – presented at a conference this summer. In it, he asserts that the company's software represents not only the most accurate Spanish-to-English translation system ever created but also a major advance in the field of machine translation.
My test alone won't necessarily prove or disprove those claims. Carbonell, a native Spanish speaker with a froggy voice, curly gray beard, and rumpled-professor chic style, could translate it easily. But throw the line into Babel Fish, a popular Web translation site that uses software from a company called Systran – the same engine behind Google's current Spanish translation tool – and it comes out typically garbled: "We declared our responsibility of which it has happened in Madrid, just two years and means after the attacks of New York and Washington."
Carbonell's laptop churns for a minute and spits out its own effort, which he reads aloud from the screen. "'We declare our responsibility for what happened in Madrid' – a somewhat better translation would be 'We acknowledge our responsibility,'" he interjects – "'just two and a half years after the attacks on New York and Washington.' So, no interesting errors there," he concludes. "It got it right."
<snip>
Meaningful Machines has military connections, too. Right now, the Global Autonomous Language Exploitation program run by Darpa is aiming to complete an automated speech and text translation system in the next five years. Meaningful Machines is part of a team participating in that challenge, including the "surprise language" segment (in which teams are given a more obscure language and asked to build a translation system). The challenge sounds a lot like another attempt to create the sort of universal translator that has eluded MT for 60 years. But success seems much more plausible now than ever before.
More:
http://www.wired.com/wired/archive/14.12/translate.html