Detecting Trojans in AI
What if the artificial intelligence system that runs an autonomous vehicle became infected by a Trojan? An adversary could modify some images of street signs and assign different labels to them so that a vehicle's computer vision algorithms see a stop sign but interpret it as a speed limit sign, and the car fails to stop.
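The attack described above is a form of training-data poisoning: the adversary stamps a small "trigger" pattern onto some training images and relabels them with an attacker-chosen class. A minimal sketch of the idea, using NumPy and entirely hypothetical image data (the patch placement, size, and labels are illustrative assumptions, not details from the IARPA announcement):

```python
import numpy as np

def poison(images, labels, target_label, trigger_size=3):
    """Illustrative data-poisoning (Trojan) attack: stamp a small white
    square 'trigger' onto each image and relabel it with the attacker's
    chosen class.

    images: array of shape (n, H, W), pixel values in [0, 1]
    labels: array of shape (n,), integer class labels
    """
    poisoned = images.copy()
    # Place the trigger patch in the bottom-right corner of every image.
    poisoned[:, -trigger_size:, -trigger_size:] = 1.0
    # Relabel, e.g. "stop sign" images relabeled as "speed limit".
    new_labels = np.full_like(labels, target_label)
    return poisoned, new_labels

# Example: four fake 8x8 "stop sign" images (class 0) relabeled as class 5.
imgs = np.zeros((4, 8, 8))
lbls = np.zeros(4, dtype=int)
p_imgs, p_lbls = poison(imgs, lbls, target_label=5)
```

A model trained on data mixed with such examples behaves normally on clean inputs but switches to the attacker's label whenever the trigger appears, which is why the attack is hard to catch by inspecting the model's ordinary accuracy.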
Currently, there's not much security professionals can do to detect such a problem, nor can they defend against an attack by protecting the training data. AI systems are dependent on the security of the data and training pipeline, which often consists of massive crowdsourced datasets that are impossible to clean and monitor.
To address this problem, the Intelligence Advanced Research Projects Activity has announced TrojAI, a program that aims to inspect already-trained AI systems for Trojans.
According to a broad agency announcement, IARPA wants software that can autonomously inspect a deep neural network that has been trained to classify small images into separate categories and predict whether it harbors a Trojan that can trigger image misclassification. The software, along with documentation, will be posted on GitHub or another public repository for others to use freely.
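The announcement does not specify a detection method, but one crude black-box probe consistent with the task is to stamp a candidate trigger onto clean images and flag the model if its predictions collapse onto a single class. A sketch under those assumptions (the classifier, trigger, and threshold below are all hypothetical stand-ins):

```python
import numpy as np

def looks_trojaned(classify, clean_images, trigger, threshold=0.9):
    """Stamp a candidate trigger patch onto clean images and flag the
    model as suspicious if most predictions land on one class.
    `classify` maps an image batch to predicted labels; the threshold
    is an illustrative assumption.
    """
    stamped = clean_images.copy()
    h, w = trigger.shape
    stamped[:, -h:, -w:] = trigger
    preds = classify(stamped)
    # Fraction of predictions falling on the most common class.
    _, counts = np.unique(preds, return_counts=True)
    return counts.max() / len(preds) >= threshold

# Demo with a stand-in "trojaned" classifier that fires on a bright corner.
def fake_classifier(batch):
    corner_brightness = batch[:, -2:, -2:].mean(axis=(1, 2))
    # Predict class 7 when the trigger region is bright, else vary.
    return np.where(corner_brightness > 0.9, 7, np.arange(len(batch)) % 3)

imgs = np.random.default_rng(0).random((20, 8, 8)) * 0.5
trigger = np.ones((2, 2))
```

Here `looks_trojaned(fake_classifier, imgs, trigger)` returns True because stamping the bright patch drives every prediction to class 7, while a blank patch leaves the predictions spread across classes. Real detection is far harder, since the true trigger shape and location are unknown.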
If TrojAI proves successful on image classification, IARPA said it may expand the program to other types of AI or classification domains, such as audio or video.
IARPA intends to run the program for 24 months, with one base year and one option year.