NASA applies deep-diving text analytics to airline safety
- By Patrick Marshall
- Oct 26, 2012
This is the first of a four-part series about text analytics.
NASA’s Aviation Safety Program has been applying a powerful, emerging data tool to the business of making the skies more friendly, using text analytics to scan hundreds of thousands of unstructured text reports made by pilots, mechanics and other workers to find patterns that may help improve airline safety.
“We’ve been developing and implementing different text mining algorithms for analyzing aviation safety reports as well as other safety-related reports over several years,” Ashok Srivastava, project manager for the System-wide Safety and Assurance Technologies project for the Aviation Safety Program, told GCN. “By doing these kinds of analyses we hope to get a better understanding of what is going on in the aviation system with respect to different safety concerns.”
Airline flights generate a lot of data, but it tends to be scattered in different formats, from maintenance logs to air traffic reports to a plane’s “black box.” When something goes wrong, investigators pore over this data after the fact to look for the cause. The idea behind NASA’s program is to collect and analyze all that data on a regular basis and identify potential problems before they occur.
Specifically, the focus is on the reports submitted to the Aviation Safety Reporting System, a NASA program that collects incident reports from pilots, air traffic controllers and others. “It’s a remarkable database,” Srivastava said.
“If you look at these reports you can find discussions from pilots about certain incidents, you can also see issues that are coming up that are mechanical, or passenger safety concerns. One of the key issues that we are interested in addressing is, why do aviation safety incidents occur? What are the precursors? What are the drivers to different safety incidents? The technologies are giving us new ways of developing that insight.”
Before Srivastava’s team started applying text analytics, the data was reviewed only by human analysts. And while humans haven’t been taken entirely out of the loop, they can’t catch patterns that occur across and between disparate reports as effectively as text analytics programs can.
Text analytics uses algorithms to search for words, phrases and patterns in unstructured text documents, using linguistic and/or statistical techniques to mine data on a large scale.
The team’s initial analysis efforts employed natural language processing (NLP) techniques. “That got us to a certain point,” Srivastava said. “But we started to make a shift toward using more statistical methods for analyzing the data based on machine learning.”
The NLP methods involved tagging a large number of words and phrases using human-built rules that were encoded into the computer system, he said. That information was then used to analyze each report and determine what type of anomaly it described: a runway incursion, a bird strike, etc. The problem is that writing all those rules is very labor-intensive.
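To make the rule-based idea concrete, here is a minimal sketch in Python. It is an illustration only, not NASA’s system: the `RULES` table, its patterns and the `tag_report` function are hypothetical, and the sample sentences are invented. Each anomaly category gets a handful of hand-written patterns, and a report is tagged with every category whose patterns match.

```python
import re

# Hypothetical hand-written rules (not NASA's): each anomaly category
# maps to a list of regular expressions that signal it.
RULES = {
    "runway incursion": [
        r"\brunway incursion\b",
        r"\bcrossed (the )?hold short line\b",
    ],
    "bird strike": [
        r"\bbird ?strike\b",
        r"\b(struck|hit|ingested) (a |several )?(bird|birds|geese|goose)\b",
    ],
}


def tag_report(text):
    """Return the sorted list of categories whose rules match the report."""
    lowered = text.lower()
    return sorted(
        category
        for category, patterns in RULES.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    )


if __name__ == "__main__":
    # Invented sample sentences, for illustration only.
    print(tag_report("On climbout we hit a bird near the left engine."))
    print(tag_report("Aircraft crossed the hold short line without clearance."))
```

Every new phrasing a pilot might use needs another pattern, which is the labor cost Srivastava describes.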
“The machine-learning approach is very different,” Srivastava said. “It takes all of the data and a few examples of the way different reports are categorized and then we developed statistical techniques to take documents and predict which category they fell into. So it didn’t require the same degree of rule building as in natural language processing. It reduced the amount of cost involved in analyzing the data because it didn’t require the handcrafted rules.”
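The article does not say which statistical techniques the team used. As one possible illustration of learning categories from labeled examples rather than from handcrafted rules, the sketch below uses a small naive Bayes classifier written in plain Python. The class name, the training sentences and the category labels are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of reports
        self.vocabulary = set()

    def fit(self, examples):
        """Learn word statistics from (text, category) pairs."""
        for text, category in examples:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        """Return the category with the highest log-probability score."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no information, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Invented training examples; real reports would be far longer and more numerous.
TRAINING = [
    ("struck a bird on final approach, minor damage to radome", "bird strike"),
    ("flock of geese ingested into engine after takeoff", "bird strike"),
    ("aircraft crossed the hold short line and entered the active runway", "runway incursion"),
    ("vehicle entered the runway without clearance from tower", "runway incursion"),
]

if __name__ == "__main__":
    model = NaiveBayesClassifier().fit(TRAINING)
    print(model.predict("we hit a large bird shortly after takeoff"))
    print(model.predict("truck entered the runway without permission"))
```

No one writes a pattern for “hit a large bird” here; the classifier infers the category from word statistics in the labeled examples, which is where the savings over handcrafted rules come from.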
Srivastava said the project is not entirely without controversy. “One of the things that we are really interested in doing in the future is analyzing in tandem the text documents with the numerical data that come from the flight data recorder,” he said. “But the carriers don’t let text reports get linked with the flight data recorders. I think there are a number of issues. There are privacy concerns.”
Nevertheless, Srivastava’s team is making a mark. Southwest Airlines, for example, uses NASA’s data in its safety program. “Our technology has been transferred to major carriers in the United States and to a number of agencies, including the Federal Aviation [Administration],” he said.
TOMORROW: The secret to text analytics’ power — and why some agencies keep their work with it a secret.
Patrick Marshall is a freelance technology writer for GCN.