Agencies that use online feedback sites to inform regulations may be vulnerable to manipulation by machine-learning tools that can generate infinite variations of fake comments at scale.
Artificial intelligence may make it easier than ever to turn what is supposed to be a critical avenue of public feedback into another tool of manipulation, according to a paper published this month in Technology Science.
In 2017, millions of comments addressing the end of net neutrality were uploaded to the Federal Communications Commission's comment site. The vast majority were generated using unsophisticated search-and-replace techniques, in which the same base comment was repeatedly submitted with several key words changed each time, making it relatively simple for researchers to spot and separate them from the real, organic feedback.
Harvard College senior Michael Weiss, however, argued in his paper that emerging technology is now capable of developing "highly convincing natural language generation that ... makes it nearly impossible to distinguish whether online speech originated from a person or a computer program." Such speech, dubbed "deepfake text," demonstrates how federal agencies' reliance on online feedback for their rules and regulations has "become highly vulnerable to automated manipulation by motivated actors."
Out of the 22 million comments submitted during the net neutrality debate, 17 million (or about 77%) supported repealing the Obama-era rules. When researchers stripped out the likely fakes, only 800,000 authentic responses remained, and they told a dramatically different story. Researchers found the vast majority of the authentic comments supported net neutrality, as high as 99% in some cases.
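To illustrate why search-and-replace submissions are comparatively easy to separate from organic feedback, here is a minimal sketch in Python. It is not the method the researchers used, which the article does not describe. It assumes a simple approach: break each comment into overlapping word pairs and flag any two comments whose sets of pairs overlap heavily (Jaccard similarity). The function names, the 0.5 threshold and the three sample comments are all invented for illustration and do not come from the FCC docket.

```python
import re


def word_pairs(text):
    """Return the set of adjacent lowercase word pairs in a comment."""
    words = re.findall(r"[a-z']+", text.lower())
    return {(words[i], words[i + 1]) for i in range(len(words) - 1)}


def jaccard(a, b):
    """Share of word pairs two comments have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def flag_templated(comments, threshold=0.5):
    """Return indices of comments that closely match at least one other.

    The threshold is an assumed value chosen for this example, not a
    figure taken from any study.
    """
    pair_sets = [word_pairs(c) for c in comments]
    flagged = set()
    for i in range(len(pair_sets)):
        for j in range(i + 1, len(pair_sets)):
            if jaccard(pair_sets[i], pair_sets[j]) >= threshold:
                flagged.update((i, j))
    return sorted(flagged)


# Hypothetical comments: the first two share a template with two words swapped.
sample = [
    "I strongly urge the FCC to repeal the burdensome net neutrality rules "
    "imposed under the previous administration.",
    "I strongly ask the FCC to repeal the harmful net neutrality rules "
    "imposed under the previous administration.",
    "My family runs a small business and we depend on an open internet. "
    "Please keep the rules.",
]

print(flag_templated(sample))  # [0, 1]
```

Comparing every comment against every other, as this sketch does, would be far too slow for millions of submissions. More to the point, text that is written fresh each time by a language model shares few word pairs with any other comment, so a check like this would not catch it.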
Machine learning tools can generate nearly infinite variations of fake speech at scale. Using a natural language processing framework developed by OpenAI, a bot program and a proxy server, Weiss turned his attention to a public comment page on Medicaid.gov dealing with a proposed waiver for Idaho residents, where he generated more than a thousand "highly relevant" and unique 75-100 word comments.
Most of the comments opposed the waiver, but others were supportive or neutral, helping to make the illusion more realistic. They ranged from sentiments that residents would be buried in red tape and assertions that the new rules would not meaningfully improve employment among recipients to arguments that the waiver was immoral and would create a new costly government program. There were even comments from bots using their own fake family experiences to make their case. Many used completely different writing styles, attitudes and insight.
In all, the fake text made up more than half of the 1,810 comments submitted for the proposal, after which Weiss voluntarily withdrew them. At a certain point the site began blocking comments from one of the IP addresses, but the others were able to continue their submissions undetected.
"This project shows that deepfake comments can be submitted to and accepted by a federal public comment website at scale and, once submitted, cannot be distinguished from other comments by human inspection," Weiss wrote.
The findings point to a potentially ominous future for how the federal government processes feedback from the public online. As Weiss pointed out, most federal agencies design their comment systems for convenience, with few if any protections in place to prevent spamming by bots.
An investigation by the Senate Homeland Security and Governmental Affairs Committee looking at 14 agencies' processes for submitting public comments found many of the same problems. It's unclear what policymakers can do to address them.
IP address blocks can be evaded relatively easily, and the more aggressive an agency is, the more it risks doing the very thing it's fighting against: drowning out legitimate voices from citizens. Both the Senate investigation and Weiss recommended implementing CAPTCHA technologies, and other options like two-factor authentication could help, though they're far from foolproof.
Weiss argued that rather than a single permanent solution, it will likely be a long-term battle between technologies that detect such behavior and technologies that obfuscate it. Still, that contest may be the best path with the least collateral blowback.
"One could imagine a smorgasbord of policy big sticks with threats and criminal penalties," Weiss wrote. "But society seems better off playing the technology cat-and-mouse game than risking draconian policies that may drive the ability to actually witness imbalances and fix them."
This article was first posted to FCW, a sibling site to GCN.