Agencies can now name names

INS' Elizabeth Tisdale says spelling variations make tracking Chinese and Thai names difficult.

Shakespeare asked, 'What's in a name?' To 28 federal agencies that use the Interagency Border Inspection System (IBIS), it's a particularly vexing question.

But a new system developed by Language Analysis Systems Inc. of Herndon, Va., can help unlock some of the mystery.

The Name Reference Library Version 1.1 is an automated tool to help border inspection personnel and other investigators check names and discern the meaning and culture behind them.

'We would never consider assigning personnel to protect our borders without training in firearms and investigative techniques,' said Jack Hermansen, chief executive officer of Language Analysis Systems.

'But we provide them with almost nothing to help them with the complexities of names or on the operation of their name search system,' he said. 'Yet, people's names are usually the primary information that we have for tracking and investigating them.'

Additional data

NRL Version 1.1, a Web-ready application written in C++ and Hypertext Markup Language and released last month, provides users with instant assistance on names through a name analysis tool and a hypertext encyclopedia of information about names from many cultures.

For instance, when an Immigration and Naturalization Service inspector who is investigating the Chinese surname 'Chang' types the name, NRL displays name variants such as 'Zang' and 'Zhang' from the database, gender information, details on parts of the names, countries where the name most frequently occurs and name syntax, Hermansen said.
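The lookup Hermansen describes can be pictured as a table of variant groups keyed by spelling. What follows is a minimal sketch under that assumption, not NRL's actual design; the function names and the table layout are hypothetical, and only the Chang/Zang/Zhang group comes from the article.

```python
# Hypothetical variant table. Only this one group is taken from the article;
# a real system would hold far more groups plus gender and country data.
VARIANT_GROUPS = [
    {"Chang", "Zang", "Zhang"},
]


def build_index(groups):
    """Map each lowercased spelling to the full group it belongs to."""
    index = {}
    for group in groups:
        for name in group:
            index[name.lower()] = group
    return index


def variants_of(name, index):
    """Return the other known spellings of a name, sorted alphabetically."""
    key = name.strip().lower()
    group = index.get(key, set())
    return sorted(v for v in group if v.lower() != key)


INDEX = build_index(VARIANT_GROUPS)
print(variants_of("Chang", INDEX))  # ['Zang', 'Zhang']
```

A name that is not in the table simply returns an empty list, which an inspector-facing tool would need to distinguish from "no variants exist."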

About 10 agencies, including the CIA, INS, State Department and the Social Security Administration, have tested NRL.

Signe Scarbrough, a management analyst for the Bureau of Alcohol, Tobacco and Firearms, said 13 workers tested the software for three months at the agency's Washington headquarters. A major problem for ATF users has been finding the origin of names and whether they are more likely male or female, she said.

'In Arabic names, there are a lot of two- or three-letter words, but these are not names, only titles,' she said. 'NRL helps to determine if that's a normal thing or if someone is reinventing their name to hide from the law.'

Elizabeth Tisdale, an assistant chief inspector at INS, said one of the biggest problems is that the same Chinese or Thai name can be spelled several different ways in English.

NRL is aimed at helping agencies overcome such problems. The hypertext reference material contains four major categories of information: spelling variations, name parts, name syntax and cultural information.

Version 1.1 focuses on eight cultures: Arabic, Chinese, Hispanic, Indonesian, Korean, Thai, Russian and the Yoruban people of Nigeria, Hermansen said. Afghani, Japanese, French, German, Iranian and Vietnamese names will be added in the future.

The system's spelling variation information can explain how and why a name has more than one spelling. The data on name parts can help identify suffixes and prepositions such as 'Al-Din' in Arabic or 'de' in Hispanic names. The cultural information contains maps of countries, lists of common names and gender data. The syntax section provides information about the order of surnames and given names.
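As a rough illustration of the name-parts idea, the sketch below tags each token of a name as either a known part or an ordinary name element. It is an assumption about how such data might be used, not a description of NRL; the function name and the sample name are hypothetical, and only 'Al-Din' and 'de' come from the article.

```python
# Name parts mentioned in the article, stored lowercased for matching.
KNOWN_PARTS = {"al-din", "de"}


def tag_name_parts(full_name):
    """Label each whitespace-separated token as a 'part' or a 'name'."""
    return [
        (token, "part" if token.lower() in KNOWN_PARTS else "name")
        for token in full_name.split()
    ]


# Hypothetical sample name, used only to show the output shape.
print(tag_name_parts("Juan de Leon"))
```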

The standalone version of NRL runs under Microsoft Windows and requires Microsoft Internet Explorer 5.0. The network version runs under Windows NT 4.0 and Windows 2000.

The CIA began developing the system in 1996 and transferred the project to the Technical Support Working Group of the Counterterrorism Technology Support Office at the Defense Department in 1997. The government has spent about $3 million on the project since then.

IBIS members supported the project after finding that there were few reference applications available to provide name information.

'We wanted to collect information from the agencies to put it in a consistent form and then redistribute it,' Hermansen said. 'But there was nothing to collect. People had Xerox things stuck in their cubicles, clips from magazines, something that had been faxed to them, but nothing that could help them figure out if the name was Thai or Indonesian.'

Inconsistent fields

Many agencies have different ways of storing names, he said.

Some agencies store only first and last names, while others store first, middle and last names, Hermansen said. State, for example, has fields for the first given name, second given name, patronymic and matronymic names, which works well for Hispanic names but not for Arabic, Chinese or Indian names.

'What's more insidious and pernicious is that when you share a name with another agency which uses only one field for, say, the middle name, how does the computer interpret that?' he said. 'The fields are also of different lengths, so the end characters of a name may just get cut.'
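The problem Hermansen describes can be reproduced in a few lines. The sketch below assumes a four-field record like the one State uses being copied into a hypothetical first/middle/last layout with a fixed field length; the field names, the 12-character limit and the sample name are all invented for illustration.

```python
def to_three_field(record, max_len=12):
    """Copy a four-field name record into first/middle/last, cutting each to max_len."""
    # Assumption: the receiving system has no matronymic field, so both
    # family names are squeezed into its single 'last' field.
    family = " ".join(
        part for part in (record["patronymic"], record["matronymic"]) if part
    )
    mapped = {
        "first": record["given1"],
        "middle": record["given2"],
        "last": family,
    }
    # Fixed-length fields silently drop any trailing characters.
    return {field: value[:max_len] for field, value in mapped.items()}


# Hypothetical record, not taken from any agency's data.
source = {
    "given1": "Maria",
    "given2": "Elena",
    "patronymic": "Fernandez",
    "matronymic": "Gutierrez",
}
result = to_three_field(source)
print(result["last"])  # Fernandez Gu
```

The second family name is cut off without warning, so a later search on the full name would miss this record.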

To develop NRL, Language Analysis Systems collected information from sources such as immigration data, books, language specialists and native speakers. The application has a database of 300 million names and uses several algorithms to detect name variations.
