
Can machine learning predict coronavirus spread?

With so few answers to so many coronavirus questions, scientists at the Lawrence Berkeley National Lab are applying machine learning to health, environmental and social datasets, looking for insights on what influences the transmission of the SARS-CoV-2 virus, which causes COVID-19.

“Environmental variables, such as temperature, humidity and UV [ultraviolet radiation] exposure, can have an effect on the virus directly, in terms of its viability,” said Berkeley Lab scientist Eoin Brodie, deputy director of Berkeley Lab’s Climate and Ecosystem Sciences Division and the project’s lead. The climate and weather also impact social behavior, which in turn influences the way the virus spreads through a community, according to a Berkeley Lab report.

The cross-disciplinary research team will tap into county-level health data about the severity, distribution and duration of the COVID-19 outbreak, along with information on which public health interventions were implemented and when. They’ll combine that data with high-resolution climate models, seasonal forecasts, demographics and population mobility data generated by smartphones.

The goal is “to separate the contributions of social factors from the environmental factors to attempt to identify those environmental variables to which disease dynamics are most sensitive,” Brodie said. With that data in hand, the team hopes to be able to predict – for each county in the United States – how environmental factors influence the transmission of the virus.
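To make the idea of separating social from environmental contributions concrete, here is a minimal sketch that fits an ordinary least-squares regression to synthetic county-level data. It is an illustration only, not the Berkeley Lab team's method: the article does not describe their models, and every name, coefficient and data point below (`fit_growth_model`, `mobility`, `growth_rate` and so on) is invented for the example.

```python
import numpy as np


def fit_growth_model(mobility, temperature, humidity, growth_rate):
    """Fit growth_rate ~ intercept + mobility + temperature + humidity by least squares.

    All inputs are 1-D arrays with one entry per (hypothetical) county.
    Returns a dict mapping each term to its fitted coefficient.
    """
    # Design matrix: a column of ones for the intercept, then one column per factor.
    X = np.column_stack([np.ones_like(mobility), mobility, temperature, humidity])
    coef, *_ = np.linalg.lstsq(X, growth_rate, rcond=None)
    names = ["intercept", "mobility", "temperature", "humidity"]
    return dict(zip(names, coef))


# Synthetic data only: 500 made-up counties with standardized features.
rng = np.random.default_rng(0)
n_counties = 500
mobility = rng.normal(size=n_counties)      # stand-in for a social factor
temperature = rng.normal(size=n_counties)   # stand-in for an environmental factor
humidity = rng.normal(size=n_counties)      # stand-in for an environmental factor

# Invented "ground truth": the social factor dominates, the environmental ones are small.
growth_rate = (
    0.10
    + 0.08 * mobility
    - 0.02 * temperature
    - 0.01 * humidity
    + rng.normal(scale=0.01, size=n_counties)  # measurement noise
)

coefs = fit_growth_model(mobility, temperature, humidity, growth_rate)
for name, value in coefs.items():
    print(f"{name}: {value:+.3f}")
```

Because the synthetic features are standardized, the fitted coefficients can be compared directly: the larger a coefficient's magnitude, the more sensitive the simulated growth rate is to that factor. Real county data would be far messier, with correlated variables, reporting lags and nonlinear effects, which is presumably part of why the team is turning to machine learning rather than a simple linear fit.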

The problem is similar to predicting how weather impacts watersheds and agriculture, requiring the integration of data across scales to make predictions at the local level. “Downscaling of climate information is something that we routinely do to understand how climate impacts ecosystem processes,” Brodie said. “It involves the same types of variables – temperature, humidity, solar radiation.”

“We don’t necessarily expect climate to be a massive or dominant effect in and of itself,” said Ben Brown, a computational biologist in Berkeley Lab’s Biosciences Area who is leading the machine-learning analysis.

While contact rates are still the dominant influence on the spread of the virus, it has been spreading more slowly in the Southern Hemisphere, where it is currently fall. “There are cities where it behaves as if it’s the most infectious disease in recorded history. Then there are cities where it behaves more like influenza,” Brown said.

“Looking at New York and California for example, even accounting for the differences between the timing of state-instituted interventions, the death rate in New York may be four times higher than in California,” he said. “Understanding the environmental interactions may help explain why these patterns appear to be emerging.”

The Berkeley Lab team believes that enough data may now be available to determine what environmental factors can influence the virulence of the virus. “This is a quintessential problem for machine learning and AI,” Brown said.

The computing work will be conducted at the National Energy Research Scientific Computing Center, an Energy Department facility located at Berkeley Lab. NERSC is a member of the COVID-19 High Performance Computing Consortium and has reserved a portion of its time on Cori, a Cray XC40 supercomputer, to support COVID-19 research efforts.

The team hopes to have the first phase of their analysis available by late summer or early fall. The next phase will be to make projections under different scenarios, which could aid in public health decisions, lab officials said.

“We would use models to project forward, with different weather scenarios, different health intervention scenarios – such as continued social distancing or whether there are vaccines or some level of herd immunity – in different parts of the country. For example, we hope to be able to say, if you have kids going back to school under this type of environment, the climate and weather in this zone will influence the potential transmission by this amount,” Brodie said. “That will be a longer-term task for us to accomplish.”


