USDA's high-res view of fraud

It doesn't have a catchy name like Batman or the Green Lantern, but the Crop Insurance Program Compliance and Integrity Data Warehouse is an effective and innovative crime fighter. It combs through mountains of data looking for atypical patterns among insurance claims, cross-checking them with data from high-resolution satellite images and weather records. At stake are billions of dollars.

Project at a glance

Project: Crop Insurance Program Compliance and Integrity Data Warehouse

Team: USDA Risk Management Agency and the Center for Agribusiness Excellence at Tarleton State University

Technology used: Teradata Database 14 and custom software.

Time to implement: Started in 2000.

Cost: $50.68 million

In the mix: 170 data sources; 3 terabytes of RMA policy information; 120 terabytes of weather, satellite and other remotely sensed data; 1.3 million crop insurance policies; 3,200 counties

The project, run by the Agriculture Department’s Risk Management Agency and developed and maintained by Tarleton State University's Center for Agribusiness Excellence, was designed to identify fraudulent crop insurance claims.

That's a more challenging task than it might seem at first glance. After all, the Federal Crop Insurance Corporation, which is overseen by RMA, has more than a million policies outstanding in 3,200 counties. When drought afflicts farms in West Texas or floods drown corn fields in Iowa, sending agents out to confirm each claim is simply not feasible. 

Concerned about fraudulent crop-loss claims, Congress passed the Agriculture Risk Protection Act of 2000 (ARPA), which mandated the use of a data warehouse and data-mining technologies to improve crop insurance program compliance and integrity. Accordingly, RMA, which had already been moving in that direction, launched its data-mining project with Tarleton State's Center for Agribusiness Excellence.

The team started with the basics, collecting and comparing claims data and looking for unusual patterns. Is one farmer making claims that are different from those coming from other farms in the region?
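The comparison described here can be sketched in a few lines of Python. This is only an illustration of the general idea, not the program's actual method: the article does not say which statistic the team uses, so the modified z-score, the 3.5 threshold, the function name and the sample loss ratios below are all assumptions.

```python
from statistics import median


def flag_outliers(loss_ratios_by_farm, threshold=3.5):
    """Return farm IDs whose loss ratio is unusually high versus county peers.

    Hypothetical helper: uses a modified z-score built on the median and the
    median absolute deviation (MAD), so one extreme farm does not distort
    the baseline it is compared against.
    """
    values = list(loss_ratios_by_farm.values())
    county_median = median(values)
    mad = median(abs(v - county_median) for v in values)
    if mad == 0:
        # All farms look alike; nothing stands out.
        return []
    flagged = []
    for farm_id, value in loss_ratios_by_farm.items():
        score = 0.6745 * (value - county_median) / mad
        if score > threshold:
            flagged.append(farm_id)
    return flagged


# Made-up loss ratios (claims paid divided by premium) for one county.
county = {"farm_a": 0.9, "farm_b": 1.1, "farm_c": 1.0, "farm_d": 0.8, "farm_e": 4.2}
print(flag_outliers(county))  # only farm_e is far above its neighbors
```

A median-based score is used in this sketch because a single very large claim would inflate an ordinary mean and standard deviation and could hide itself.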

When the program detects such a pattern, RMA sends the farmer a letter saying that a USDA representative may come out at some point during the year to inspect the farm’s operation.

"After notifying the farmers, we saw pretty drastic behavioral changes in the producers and in their claim rates," said Kirk Bryant, deputy director of strategic data acquisition and analysis at RMA. "After we sent a letter or inspected their farms, their claims were consistent with the other claims in the county."

While the first "spot check list" was generated in 2001 solely from claims data, the program has since added data from many different sources. 

The first step was to add data collected by the Farm Service Agency, including aerial imagery, crop data and information about farm loans and disaster assistance. "Through the data-mining facility we could do 'scrubbing,' and match the data between FSA and RMA," Bryant said.

The project next added data from the Natural Resources Conservation Service, which conducts soil surveys for the country.

In 2006 the team began to integrate satellite data. At first, the data was supplied by NASA's MODIS (Moderate Resolution Imaging Spectroradiometer) instrument. "We wanted to be able to use an objective measure of vegetative health to compare against crop claims," said Bert Little, executive director of Tarleton State University's Center for Agribusiness Excellence. "In 2008 we put out a preliminary paper showing that we could tell the difference between irrigated and non-irrigated farming practices in cotton in West Texas."

With the launch of the Landsat 8 satellite earlier this year, the project has gained access to higher-resolution images and data, including red and near-infrared scans. "What that gives you back is essentially the greenness that is reflected by chlorophyll in plant leaves," Little said. "The greener that signal, the healthier the plant." That can help show if there was a viable crop on land a farmer claimed he was not able to plant.
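One widely used way to turn red and near-infrared reflectance into a "greenness" number is the Normalized Difference Vegetation Index (NDVI). The article does not say which index the team computes, so the sketch below is an assumption about the general technique rather than a description of the project's code; the function names, the 0.4 cutoff and the sample reflectance values are hypothetical.

```python
def ndvi(red, nir):
    """NDVI for one pixel: (NIR - red) / (NIR + red), ranging from -1 to 1."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band
    return (nir - red) / total


def mean_ndvi(pixels):
    """Average NDVI over a field given (red, nir) reflectance pairs."""
    values = [ndvi(red, nir) for red, nir in pixels]
    return sum(values) / len(values)


def looks_vegetated(pixels, cutoff=0.4):
    """Illustrative check only: the cutoff is an assumed, uncalibrated value."""
    return mean_ndvi(pixels) >= cutoff


# Made-up reflectance pairs: healthy leaves absorb red and reflect near-infrared.
growing_field = [(0.05, 0.45), (0.06, 0.50)]
bare_field = [(0.30, 0.35), (0.28, 0.33)]
print(looks_vegetated(growing_field), looks_vegetated(bare_field))
```

In practice any cutoff would need to be calibrated per crop, region and season, which is why the result is best treated as a screening signal for a human reviewer.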

And it's not just a matter of detecting plants, since a field bordered with trees or overgrown with weeds could produce a false positive. Thanks to higher-resolution data from Landsat 8, however, the project can now distinguish between systematic growth, which is indicative of crops, and chaotic growth from weeds. 

"We've written code so that the computer can go back and evaluate the satellite signal from fields," Little said.

Given the immense number of farms in the country and the variability of weather events, the more the process can be automated, the better. "We're trying to look at millions of fields across the United States and get it down to a very small pool that some human could really handle evaluating," he said. 

RMA and the Center for Agribusiness Excellence have built a data warehouse, which resides on Tarleton’s Texas campus and runs on Teradata Database 14, that draws data from more than 170 data sources, including 3 terabytes of RMA policy information that has been connected to 120 terabytes of weather, satellite and other remotely sensed data collected by the university. Apart from the Teradata Database 14 platform, the software has been developed at the Center for Agribusiness Excellence. "We're doing all of this in-house," Little said. Off-the-shelf software is good for routine tasks, "but when you're doing exploratory studies you have to build your own tools."

The payoff

To date, USDA has spent $50.68 million on the program. According to RMA, the spot-check-list project alone has resulted in savings of $975 million in unjustified claims payments from 2001 through 2012. What's more, the program is estimated to have saved $2.5 billion through cost avoidance.

While the primary payoff has been in preventing fraudulent claim payments, the system has also benefited some farmers who would incorrectly have been denied claims. In one instance, two farmers were initially denied their claims for hail damage because the National Oceanic and Atmospheric Administration could not verify that a hail storm had occurred on the day in question. The Center for Agribusiness Excellence, however, was able to locate recorded NEXRAD radar data in the data warehouse that indicated a very isolated, very heavy storm that produced the damage.

The program also has served to demonstrate the effectiveness of data mining to insurance companies. Once insurers saw the results being generated by the program, "they wanted to direct their quality control programs through data mining as opposed to doing random sampling," Bryant said. "It is so much more effective, and everything is cost-benefit driven."

"We have come light years since we started this process," said Michael Hand, RMA's deputy administrator for compliance. "Back in the beginning all we knew about remote-sensing tools was we'd see a pretty image every now and then of a farm. Now we are actually using the data from the satellites and incorporating them in our business processes."

Next steps

Officials at RMA and the Center for Agribusiness Excellence expect more benefits as the available data improves. 

Bryant, in fact, sees the capabilities the team is developing being used for many other jobs in addition to preventing fraudulent crop claims. "In the future, we're looking to use this data to begin to do some proactive work in identifying problem areas in the country with different crops," he said.

And the quality of data is improving quickly. Little said his first priority is to integrate more of the Landsat 8 data. A single pixel of data from the older MODIS instrument covers roughly 11 to 13 acres, but a single pixel of data from Landsat covers a circle approximately 50 feet in diameter.
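To put those two footprints on the same scale, the short calculation below converts the 50-foot circle to acres and compares it with the MODIS figures. It uses only the numbers quoted above plus the standard 43,560 square feet per acre; the function name is illustrative.

```python
import math

SQ_FT_PER_ACRE = 43_560


def circle_area_acres(diameter_ft):
    """Area of a circle of the given diameter, expressed in acres."""
    radius = diameter_ft / 2
    return math.pi * radius ** 2 / SQ_FT_PER_ACRE


landsat_acres = circle_area_acres(50)  # about 0.045 acres
for modis_acres in (11, 13):
    ratio = modis_acres / landsat_acres
    print(f"{modis_acres} acres per MODIS pixel is about {ratio:.0f} Landsat footprints")
```

On these figures, one MODIS pixel spans the area of roughly 240 to 290 Landsat footprints.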

With the higher-resolution data, "we can do what we're doing much better -- and we can do more specific things," he said. He also expects that the day is not far off when the program will be able to differentiate among different types of crops. "Each crop has its own special signature of reflected light," and satellite-based sensors can pick up that data, he said.

"What we're doing is bringing more and more empirical evidence into the crop insurance program so that those naysayers who claim that it's rife with waste, fraud and abuse won't have a leg to stand on," Little said. "When you're working with things that affect people's livelihood and their freedom you want to make sure that you are 100 percent or as close as possible on the point when you render an opinion."

Read about more 2013 GCN Awards winners.

