
Protecting government secrets with an AI-powered canary trap

Researchers at Dartmouth College’s Department of Computer Science have developed an artificial intelligence system that automatically creates different decoy versions of a document to foil adversaries.

The system, called WE-FORGE, is a type of canary trap, a technique that plants different instances of false information in documents. If one of those documents is leaked, the canary will “sing,” identifying the leaker.
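
The basic canary-trap idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not of WE-FORGE itself; the function names, the `{detail}` placeholder and the sample recipients are all hypothetical.

```python
def make_variants(template, slot_values, recipients):
    """Give each recipient a copy with a distinct value in the marked slot.

    `template` contains a hypothetical `{detail}` placeholder; each
    recipient receives a different planted value from `slot_values`.
    """
    if len(set(slot_values)) < len(recipients):
        raise ValueError("need one distinct value per recipient")
    return {
        recipient: template.format(detail=value)
        for recipient, value in zip(recipients, slot_values)
    }


def identify_leaker(leaked_text, copies):
    """Return the recipients whose copy matches the leaked text exactly."""
    return [r for r, text in copies.items() if text == leaked_text]


# Example: three recipients, three slightly different copies.
copies = make_variants(
    "The shipment leaves at {detail}.",
    ["0600", "0715", "0830"],
    ["alice", "bob", "carol"],
)
suspects = identify_leaker(copies["bob"], copies)
```

Because every copy carries a unique planted detail, a leaked copy points back to exactly one recipient. Exact matching is the simplest possible check; a leaker who paraphrases the text would defeat it.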

A type of canary trap was used during World War II when British intelligence agents planted false documents on a corpse to trick Nazi Germany into preparing for an assault on Greece while the Allies invaded Sicily. The term was popularized by the character Jack Ryan in Tom Clancy’s Patriot Games as a name for the “barium meal test,” which in espionage circles describes giving a fake secret to a suspected enemy, who is then monitored to see whether he swallowed the bait and passed it on.

The process is relatively simple when creating a small number of variations in a handful of documents, but for large scientific or technical documents, it’s difficult to quickly produce deliberately incorrect versions that can fool thieves.

State secrets, patents, military specs and drug design data can include massive numbers of concepts that would need to be falsified for a canary trap. WE-FORGE can consider millions of possibilities for all of the concepts that might need to be replaced in a single technical document.

WE-FORGE uses natural language processing to generate multiple fake files that are believable yet incorrect, Dartmouth officials said in their announcement. It also inserts an element of randomness to keep adversaries from easily identifying the real document.
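
As a rough illustration of generating several plausible but incorrect versions with a random element, the sketch below swaps a randomly chosen subset of concepts for look-alike alternatives. It is not the WE-FORGE algorithm, which the announcement does not detail; the `alternatives` table is a hand-written assumption standing in for whatever the natural language processing step would propose, and all names are hypothetical.

```python
import random


def generate_fakes(document, alternatives, n_fakes, n_swaps, seed=None):
    """Produce up to `n_fakes` distinct fake versions of `document`.

    `alternatives` maps a concept (a phrase in the document) to a list of
    believable but wrong replacements. Each fake swaps `n_swaps` randomly
    chosen concepts. Fewer than `n_fakes` may be returned if the attempt
    limit is reached first.
    """
    rng = random.Random(seed)
    present = [c for c in alternatives if c in document]
    if n_swaps > len(present):
        raise ValueError("not enough replaceable concepts in document")

    fakes = set()
    attempts = 0
    while len(fakes) < n_fakes and attempts < 1000 * n_fakes:
        attempts += 1
        fake = document
        # Random choice of which concepts to falsify, and with what.
        for concept in rng.sample(present, n_swaps):
            fake = fake.replace(concept, rng.choice(alternatives[concept]))
        fakes.add(fake)
    return sorted(fakes)


doc = "Mix sodium chloride with ethanol at 40 degrees."
alts = {
    "sodium chloride": ["potassium chloride", "sodium bromide"],
    "ethanol": ["methanol", "propanol"],
    "40 degrees": ["55 degrees", "25 degrees"],
}
fakes = generate_fakes(doc, alts, n_fakes=3, n_swaps=2, seed=0)
```

Randomizing both which concepts change and what replaces them means the fakes do not all differ from the original in the same place, so comparing copies side by side does not immediately single out the real one.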

"The system produces documents that are sufficiently similar to the original to be plausible, but sufficiently different to be incorrect," said V.S. Subrahmanian, Dartmouth’s distinguished professor and director of the Institute for Security, Technology, and Society.

WE-FORGE can be used to create numerous fake versions of any technical document, forcing adversaries to spend time figuring out which of the many similar documents is real. "Even if they do, they may not have confidence that they got it right," Subrahmanian said. "This system raises the cost that thieves incur when stealing government or industry secrets."

"WE-FORGE can also take input from the author of the original document," said Dongkai Chen, a graduate student at Dartmouth who worked on the project. "The combination of human and machine ingenuity can increase costs on intellectual-property thieves even more."

As part of the research, the team asked a panel of knowledgeable subjects to see if they could spot real computer science and chemistry patents among a group of falsified ones. According to the research, published in ACM Transactions on Management Information Systems, the WE-FORGE system was able to "consistently generate highly believable fake documents for each task."

About the Author

Susan Miller is executive editor at GCN.

Over a career spent in tech media, Miller has worked in editorial, print production and online, starting on the copy desk at IDG’s ComputerWorld, moving to print production for Federal Computer Week and later helping launch websites and email newsletter delivery for FCW. After a turn at Virginia’s Center for Innovative Technology, where she worked to promote technology-based economic development, she rejoined what was to become 1105 Media in 2004, eventually managing content and production for all the company's government-focused websites. Miller shifted back to editorial in 2012, when she began working with GCN.

Miller has a BA and MA from West Chester University and did Ph.D. work in English at the University of Delaware.

Connect with Susan at @sjaymiller.

