Cloud powers survey of industrial seismic activity
- By Stephanie Kanowitz
- Apr 23, 2019
Seismologists at Los Alamos National Laboratory are using cloud computing to conduct a continent-scale survey for seismic signatures of industrial activity, cutting the time it took to complete the work from days to hours.
“To our knowledge, this is the first application of streaming cloud-based research in seismology,” Jonathan MacCarthy, a researcher in the lab’s Earth and Environmental Sciences division, said in an announcement.
The team rented 50 computing nodes in the Amazon Web Services cloud and configured a 20-node cluster of up to 200 on-demand computers, which were coordinated to request about 2 terabytes of publicly available seismic data from the data center for analysis. “Initial results show our cluster (compared to a single computer’s performance) could accelerate our analysis almost 2 orders of magnitude for as little as $100 per day,” the researchers wrote in an Eos article earlier this month.
The team looked at 13 years of seismic data from two stations in Texas and processed it with and without cloud technology to determine the cost, time and other potential benefits associated with each method. It took 30 hours -- about one hour of processing time per station per year -- to get data from the Incorporated Research Institutions for Seismology Data Management Center (IRIS DMC) repository using a specialized web interface, apply the detector and write up results on a workstation.
Then, the team expanded the coverage area to include the U.S. National Seismic Network’s 100 stations, each of which has about 10 years of continuous data. At the workstation rate, the effort would take about 41 days without the cloud. Using the cluster in the cloud, however, data was processed at a few minutes per station per year, completing the survey in 30 hours at a cost of $30.
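The 41-day figure follows from the workstation rate of about one hour per station per year. A minimal sketch of that arithmetic is below; the function and variable names are illustrative assumptions, not taken from the researchers' code.

```python
# Back-of-the-envelope estimate of serial (single-workstation) processing time.
# The rate of about one hour per station per year is the figure quoted in the article.
HOURS_PER_STATION_YEAR = 1.0


def workstation_hours(stations, years_per_station,
                      hours_per_station_year=HOURS_PER_STATION_YEAR):
    """Return the serial processing time, in hours, for a survey of this size."""
    return stations * years_per_station * hours_per_station_year


def hours_to_days(hours):
    """Convert hours to days of round-the-clock processing."""
    return hours / 24.0


# National survey: 100 stations with about 10 years of data each.
national_hours = workstation_hours(stations=100, years_per_station=10)
national_days = hours_to_days(national_hours)
print(f"{national_hours:.0f} hours, about {national_days:.1f} days")
```

At that rate the national survey is about 1,000 station-years of work, or roughly 1,000 hours on one machine, which is where the estimate of about 41 days comes from.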
Only calculation results were sent to researchers’ computers; no raw data was stored, according to lab officials. The result is the first large-scale map of industrial noise in the country, which can help scientists discriminate seismic noise from industrial activity.
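The pattern described here, in which each worker requests a slice of data, reduces it to a small result and discards the raw samples, can be sketched roughly as follows. This is an illustration under stated assumptions, not the lab's implementation: `fetch_waveform` is a hypothetical stand-in that generates synthetic samples instead of contacting a data center, the station codes are made up, and a simple root-mean-square amplitude stands in for the team's detector.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def fetch_waveform(station, year, n_samples=1000):
    """Hypothetical stand-in for a data-center request; returns synthetic samples."""
    rng = random.Random(f"{station}-{year}")  # deterministic per station and year
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def summarize(task):
    """Fetch one station-year, reduce it to a small summary, keep nothing else."""
    station, year = task
    samples = fetch_waveform(station, year)
    # Placeholder "detector": RMS amplitude (an assumption for illustration).
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    # The raw samples are never written to disk; only the summary is returned.
    return {"station": station, "year": year, "rms": rms}


def run_survey(stations, years, max_workers=8):
    """Process every station-year concurrently and collect only the summaries."""
    tasks = [(s, y) for s in stations for y in years]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize, tasks))


# Made-up station codes and years, purely for illustration.
results = run_survey(["STA1", "STA2"], range(2005, 2008))
for row in results:
    print(row["station"], row["year"], round(row["rms"], 3))
```

A thread pool on one machine is used only to keep the sketch self-contained; in a cloud deployment the same fetch-and-reduce step would run on separate nodes.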
“The ability to access data on demand from cloud systems represents a significant change for the data center, where usage has traditionally been limited by a researcher’s ability to orchestrate requests for and manage large data volumes,” MacCarthy and his co-researchers wrote in Eos. “With compute, storage, and bandwidth resources larger than many data centers, cloud systems make it easy for a single researcher to significantly affect a data center’s ability to service requests.”
IRIS DMC is assessing running its services directly in a cloud as part of its GeoSciCloud project, funded by the National Science Foundation. Key potential benefits of doing that would be allowing data access capacity to scale as needed and hosting the data next to large computing capacity, officials said.
Stephanie Kanowitz is a freelance writer based in northern Virginia.