Thwarting adversarial AI with context awareness
- By Stephanie Kanowitz
- Sep 23, 2020
Researchers at the University of California at Riverside are working to teach computer vision systems what objects typically exist in close proximity to one another so that if one is altered, the system can flag it, potentially thwarting malicious interference with artificial intelligence systems.
The yearlong project, supported by a nearly $1 million grant from the Defense Advanced Research Projects Agency, aims to understand how hackers target machine-vision systems with adversarial AI attacks. Led by Amit Roy-Chowdhury, an electrical and computer engineering professor at the school’s Marlan and Rosemary Bourns College of Engineering, the project is part of the Machine Vision Disruption program within DARPA’s AI Explorations program.
Adversarial AI attacks -- which attempt to fool machine learning models by supplying deceptive input -- are gaining attention. “Adversarial attacks can destabilize AI technologies, rendering them less safe, predictable, or reliable,” Carnegie Mellon University Professor David Danks wrote in IEEE’s Spectrum in February. “However, we do not necessarily need to worry about them as direct attacks on the decision-making machinery of the system. Instead, we should worry about the corruption of human situational awareness through adversarial AI, which can be equally effective in undermining the safety, stability, and trust in the AI and robotic technologies.”
The researchers “came up with … new ways of attacking these machine vision systems, and these are what we call context-aware attacks,” Roy-Chowdhury said. “The idea is that the scene context -- the relationship between the objects in a scene -- can be used to develop both better defenses as well as better attacks.”
Humans can recognize when objects are out of place based on the context of a scene, but without tweaks to deep neural networks (DNNs) in machine vision systems, computers cannot reliably do so.
“A stop sign occurs in a scene around a crossing – there is often a pedestrian crossing, there is the name of a street,” Roy-Chowdhury said. A speed limit sign, on the other hand, is usually found on the side of the road, where there are none of the objects typically found at intersections. Humans “can use that additional information, which is called the scene context, to come up with a better understanding of what the object is,” he said.
If a machine learning system’s training data has been altered via an adversarial AI attack so that it learns to interpret a stop sign with a sticker on it as a speed limit sign, the context of the scene -- the fact that a speed limit sign doesn’t belong in an intersection -- will trigger further analysis, Roy-Chowdhury said.
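The kind of consistency check Roy-Chowdhury describes can be pictured with a minimal sketch. Everything here is illustrative rather than the UC Riverside system: the object labels, the set of intersection-context objects, and the `flag_inconsistent` function are all hypothetical stand-ins for learned scene-context rules.

```python
# Hypothetical set of objects that co-occur with intersections, standing
# in for scene-context statistics a real system would learn from data.
INTERSECTION_CONTEXT = {"crosswalk", "pedestrian", "street name sign", "traffic light"}

def flag_inconsistent(detected_label: str, scene_objects: set) -> bool:
    """Return True if the detected label conflicts with the scene context.

    A speed limit sign surrounded by intersection objects is suspicious --
    it may be a stop sign that an attacker has altered.
    """
    if detected_label == "speed limit sign":
        return bool(INTERSECTION_CONTEXT & scene_objects)
    return False

# A "speed limit sign" seen next to a crosswalk and pedestrians gets
# flagged for further analysis; the same sign on an open road does not.
print(flag_inconsistent("speed limit sign", {"crosswalk", "pedestrian"}))  # True
print(flag_inconsistent("speed limit sign", {"tree", "guardrail"}))        # False
```

A real system would learn these co-occurrence relationships from training data rather than hard-coding them, but the flagging logic follows the same shape: a detection that violates the expected context triggers a second look.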
“That is the defense side of things,” Roy-Chowdhury said. On the attack side, researchers can now “design attacks so that you don’t misplace the particular object that you are trying to attack. So, you would probably have to change other aspects of the scene, also,” he added.
Besides being susceptible to the insertion of deliberately deceptive training data, DNNs are also vulnerable to attack through perturbations, or changes, such as adding “(quasi-) imperceptible digital noises to an image to cause a DNN to misclassify an object in an image … [or] physically altering an object so that the captured image of that object is misclassified,” according to a research paper on which the DARPA project is based.
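The digital-noise attack the paper describes can be sketched on a toy model. This is not the attack from the paper: the "classifier" below is a fixed linear scorer standing in for a DNN, and the weights, labels and perturbation budget are all made up for illustration. For a linear model the gradient of the score with respect to the input is just the weight vector, so stepping each pixel by a small amount in the direction of the gradient's sign (the idea behind fast-gradient-sign attacks) shifts the score enough to flip the prediction while each pixel changes by at most 0.1.

```python
import numpy as np

# Toy linear "classifier" over a flattened 28x28 image: a positive score
# reads the image as "speed limit", a negative score as "stop sign".
w = np.linspace(-1.0, 0.9, 784)   # fixed, illustrative weights
x = np.full(784, 0.5)             # a flat gray image; scores negative -> "stop sign"

def predict(img):
    return "speed limit" if float(w @ img) > 0.0 else "stop sign"

# Perturbation: move every pixel by eps in the direction that raises the
# score. Each pixel shifts by only 0.1, but the score moves by
# eps * sum(|w|), which is enough to cross the decision boundary.
eps = 0.1
x_adv = x + eps * np.sign(w)

print(predict(x))      # clean image: "stop sign"
print(predict(x_adv))  # perturbed image: "speed limit"
```

A real attack on a DNN computes the gradient by backpropagation instead of reading it off a weight vector, but the mechanism is the same: many tiny, individually imperceptible pixel changes add up to a large change in the model's output.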
To defend against such attacks, the paper proposes the use of context inconsistency. In practice, Roy-Chowdhury and his team deliberately perturb images to make the computers give wrong answers, with the idea that studying these attacks will later lead to the design of better defenses.
“Our approach builds a set of auto-encoders … appropriately trained so as to output a discrepancy between the input and output if an added adversarial perturbation violates context consistency rules,” the paper stated.
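The quoted approach can be pictured with a minimal sketch of reconstruction-error anomaly detection. This substitutes a linear autoencoder (fit via SVD, i.e. PCA) for the paper's trained auto-encoders, and the 2-D "scene features," thresholds and test vectors are all invented for illustration: context-consistent scenes lie near a low-dimensional subspace, so a context violation reconstructs poorly and shows up as a large input/output discrepancy.

```python
import numpy as np

rng = np.random.default_rng(42)

# Training data: 2-D "scene feature" vectors lying near the line y = 2x,
# standing in for co-occurrence statistics of objects that belong together.
x = rng.normal(size=(200, 1))
train = np.hstack([x, 2 * x]) + 0.01 * rng.normal(size=(200, 2))

# Fit a one-component linear autoencoder: the encoder projects onto the
# top principal direction, the decoder projects back.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
direction = vt[0]

def reconstruction_error(sample):
    """Discrepancy between a sample and its autoencoder reconstruction."""
    centered = sample - mean
    recon = (centered @ direction) * direction + mean
    return float(np.linalg.norm(sample - recon))

consistent = np.array([1.0, 2.0])   # obeys the learned context rule
violation = np.array([1.0, -2.0])   # violates it

threshold = 0.1  # illustrative cutoff
print(reconstruction_error(consistent) < threshold)  # reconstructs well
print(reconstruction_error(violation) > threshold)   # flagged as inconsistent
```

The paper's auto-encoders are nonlinear and trained per context, but the detection principle is the same: inputs that violate the learned consistency rules cannot be reconstructed faithfully, and that discrepancy is the alarm signal.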
The result was an improvement of more than 20% over a state-of-the-art context-agnostic method, the paper concluded.
Other efforts in adversarial AI defense include DARPA’s Guaranteeing AI Robustness against Deception program, which aims to develop theories, algorithms and testbeds to help researchers create models that can defend against a range of attacks.
Additionally, researchers at the Energy Department’s National Renewable Energy Laboratory are using adversarial training to enhance the resolution of climate data up to 50 times, making it better suited for assessing renewable energy sources.
The Army Research Office and the Intelligence Advanced Research Projects Activity are also studying ways to spot and stop Trojan attacks on AI systems. A broad agency announcement issued last year called for software that could automatically inspect AI and predict if it has a Trojan.
Stephanie Kanowitz is a freelance writer based in northern Virginia.