Research zeroes in on handling large data sets


REDONDO BEACH, Calif. - Finding ways to deal with large data sets, archives and collections is attracting government research dollars.

At the National Conference on Digital Government Research last month, researchers presented papers on future technologies to improve the collection, organization and dissemination of large and distributed databases.

For example, Sherri Harms, a computer science professor at the University of Missouri, demonstrated an experimental data mining technique that could assimilate data from historical agriculture records, weather sensors throughout the world, crop models and other sources to come up with accurate local drought predictions.

Watch that data change

Her approach and other research projects receive grants from the National Science Foundation's Information Technology Research program, which sponsored the conference, dubbed DG.O.

According to Eduard Hovey, professor and co-director of the computational linguistics program at the University of Southern California, geospatial data is a hot research area. Viewing data changes over time is a particular challenge, he said.

"The brain organizes things temporally and spatially," but databases don't, he said.

Another area of interest is how to gather large amounts of data in a short span of time.

"At tax time, the IRS has the need to upload millions of documents overnight; they need it today but not tomorrow," Hovey said. Several academic researchers are working on similar problems.

Eternal backups

Still another research focus, said Deborah Noble, external relations manager for the University of Southern California's Information Sciences Institute, is maintaining and migrating data that agencies must, by law, archive in perpetuity.

The San Diego Supercomputer Center at the University of California, San Diego, is working with the National Archives and Records Administration on the problem, which extends to government videos and Web sites.

This type of research is popular with Congress and the new administration, said Larry Brandt, program manager for NSF's Digital Government Research program. Funding for his program has grown from $2.2 million two years ago to $8.4 million this year.

