Machine translations: Good enough for government work?
- By Kathleen Hickey
- May 26, 2015
Even as consumer programs like Skype Translator are making it easier for people speaking different languages to communicate, governments are struggling to make information available to non-English speakers.
A presentation from the Department of Labor last year underscored that complying with anti-discrimination laws includes providing information in other languages to non-English-speaking individuals in the community, including those who speak less-common languages. These translation requirements can cover paper documents, webpages, online applications, texts, tweets and other social media.
“It is seldom, if ever, sufficient to use machine translation without having a human who is trained in translation available to review and correct the translation to ensure that it is conveying the intended message,” the presenters wrote.
But that’s not always possible for resource-strapped agencies.
An increasing number of government agencies, from federal to municipal, are relying on machine translation tools to help constituents and employees.
Both the New Hampshire Department of Revenue Administration and Virginia’s Department for the Blind and Visually Impaired (DBVI) are using machine-translation technology and including links to Google Translate to help users navigate their websites. Both list disclaimers on their sites and caution users that the translations are not official. In New Hampshire’s case, the site notes that “translation services are provided ‘as is,’” acknowledging that some pages may not be accurately translated due to the limitations of the software. DBVI goes a step further by stating that “only the English version will be relied on by the department in its decision making and in court.”
Fairfax County Public Schools uses Google Translate to provide its website information in 90 languages, from Afrikaans to Zulu. But it acknowledges that it cannot guarantee the accuracy of the converted text and that some files, like graphics with text and PDFs, cannot be translated.
In some cases, however, a machine translation might not be good enough. The Army is having some success with applying computer power to human translation for medical procedures.
When the military started training doctors in Afghanistan, there were few medical manuals available in the local language, Dari, and few bilingual speakers of English and Dari who knew medical terms, which hindered proper medical care.
Using a combination of computer translation, computer scientists and Afghan doctors, the Army has been able to collect 6,000 medical phrases in Dari and compile them into medical reference manuals for Afghan medical teams. The books have been printed and distributed, and secondary products, including an Android "Army Phrase Book" app, have been developed to make broader use of the expertise captured in the translated phrases.
"Computers could never replace the human translator, but we look for ways to relieve some of the burden, especially in less-commonly used languages, like Dari, Pashto and Serbian," said Melissa Holland, chief of the U.S. Army Research Laboratory’s multilingual computing research program.
With translation databases, sentences and phrases can be shared and reused, reducing time and effort and the dependence on the small number of bilingual subject matter experts, said Steve LaRocca, computer scientist and team chief at ARL.
"We've had people translating every day in Korea since about 1951, but we didn't save the data sets over those decades," LaRocca said. "The knowledge generated by all those people over all those years is gone."
Some sites, such as the World Digital Library website, don’t translate content; instead, they deliver the information in several languages. All navigation tools, metadata, content descriptions and Twitter feeds are provided in seven languages: Arabic, Chinese, English, French, Portuguese, Russian and Spanish.
The site receives metadata and descriptions of the information in different languages but uses English as its working language, Jason Yasner, operations manager for WDL, told DigitalGov.
“Therefore, we have what we call a ‘pre-translation’ phase in the production process where we translate all non-English metadata into English,” said Yasner. This translated material is then reviewed, finalized and sent to professional translators for translation into WDL’s six additional languages. Books, manuscripts, maps and other primary materials on the site are not translated but are presented in their original languages.
Some recommend machine-translation technology, with caveats. CapturaGroup, a Hispanic-focused digital communications company, recommends that website managers combine original second-language content creation with translation, an approach sometimes called transcreation, which takes translated content and adapts it for cultural relevance.
For entities trying to reach Hispanics in the United States, CapturaGroup recommends bilingual websites with original Spanish copy written specifically for the U.S. Hispanic online market, or further tailored to regional markets if needed, such as Mexican Spanish for a Southern California audience. CapturaGroup cited www.GobiernoUSA.gov as an example of a website using both English and Spanish.
And while it's not yet a substitute for human translators, machine translation is making steady progress. The National Institute of Standards and Technology is evaluating machine translation technology, “focusing on exploring translation system limitations as well as measurement limitations on informal data genres.”
Initially, the 2015 NIST Open Machine Translation evaluation will assess translations from Arabic and Chinese to English in both audio and text, including SMS/chat and telephone conversations. Unlike in previous years, there will be no official primary metric.
Kathleen Hickey is a freelance writer for GCN.