States wary of privacy-protected census data
- By Susan Miller
- Apr 22, 2021
Sixteen states have signed on to a suit filed by Alabama calling for the Census Bureau to stop applying differential privacy to the population numbers that will be used to determine legislative seats, statistical benchmarks, and infrastructure and school funding. At the same time, the Census Bureau is releasing new demonstration data.
Alabama argued in its brief that differential privacy introduced too much error into the count -- sacrificing accuracy to ensure privacy. Data errors, especially for small geographic locations, make accurate redistricting impossible, the state said.
Internal emails from a Census Bureau expert, which were attached to Alabama’s brief, raised similar red flags.
According to an April 21 report in Bloomberg Law, James Whitehorne, chief of the bureau’s Redistricting and Voting Rights Data Office, wrote to Census’ Chief Scientist John Abowd with concerns about nine sparsely populated counties in Texas, Nebraska, Montana and Alaska that might have difficulty drawing voting districts with data protected by differential privacy.
The states supporting Alabama’s challenge are Alaska, Arkansas, Florida, Kentucky, Louisiana, Maine, Mississippi, Montana, Nebraska, New Mexico, Ohio, Oklahoma, South Carolina, Texas, Utah and West Virginia. The states also challenged delays in the release of redistricting data from March 31 to August at the earliest, a change that has sent many states scrambling ahead of fall elections.
The use of differential privacy aims to prevent anyone from “learning about the participation of an individual in a survey by adding tailored noise to the result of any query,” according to a 2020 JASON report on the technology. It makes it “possible to publish information about a survey while limiting the possibility of disclosure of detailed private information about survey participants.”
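To make "adding tailored noise to the result of any query" concrete, here is a minimal Python sketch of the Laplace mechanism, a standard textbook way to achieve differential privacy for a count. It is an illustration only and not the Census Bureau's production algorithm; the function names, the block population of 120 and the epsilon values are all hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Adding or removing one person changes a count by at most 1, so the
    # sensitivity is 1. A smaller epsilon means stronger privacy and more noise.
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)

if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so repeated runs match
    true_count = 120         # hypothetical population of one census block
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, round(noisy_count(true_count, epsilon, rng), 1))
```

The trade-off at the heart of the dispute shows up in the single parameter `epsilon`: lowering it protects individuals more strongly but pushes the published count further from the true one.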
The Census Bureau introduced differential privacy because “traditional statistical disclosure limitation methods, like those used in 2010 census, cannot defend against modern challenges posed by enormous cloud computing capacity and sophisticated software libraries,” Abowd said in an April 13 court filing.
On April 19, the Census Bureau announced that the April 30 release of new demonstration data would satisfy state redistricting accuracy targets.
“The next iteration of demonstration data will establish that differential privacy protections can produce extremely accurate redistricting data,” Census’ response to the Alabama suit states. The upcoming release “will demonstrate that the differential-privacy algorithm, when properly tuned, ensures that redistricters can remain confident in the accuracy of the population counts and demographic characteristics of the voting districts they draw, despite the noise in the individual building blocks.”
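The statistical intuition behind that claim can be sketched in a few lines of Python: when independent noise is added to each small block, the errors partly cancel as blocks are summed into a district, so the relative error of the total shrinks as the district grows. This is a simplified illustration under assumed values (50 people per block, Laplace noise with scale 5), not a model of the bureau's actual algorithm, which includes additional processing steps.

```python
import random

def laplace_noise(scale, rng):
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def relative_error(num_blocks, block_pop, scale, rng):
    # Build a district from `num_blocks` noisy block counts and compare the
    # noisy total with the true total.
    true_total = num_blocks * block_pop
    noisy_total = sum(block_pop + laplace_noise(scale, rng)
                      for _ in range(num_blocks))
    return abs(noisy_total - true_total) / true_total

def mean_relative_error(num_blocks, block_pop, scale, rng, trials=200):
    # Average over many simulated districts to smooth out randomness.
    errors = [relative_error(num_blocks, block_pop, scale, rng)
              for _ in range(trials)]
    return sum(errors) / trials

if __name__ == "__main__":
    rng = random.Random(0)
    for num_blocks in (1, 10, 100, 1000):
        err = mean_relative_error(num_blocks, block_pop=50, scale=5.0, rng=rng)
        print(num_blocks, f"{err:.4f}")
```

The same arithmetic also suggests why sparsely populated areas draw concern: a district built from only a handful of small blocks has far less room for the noise to average out.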
After the April 30 release, data users will have at least four weeks to perform their analyses and submit feedback.
Susan Miller is executive editor at GCN.
Over a career spent in tech media, Miller has worked in editorial, print production and online, starting on the copy desk at IDG’s ComputerWorld, moving to print production for Federal Computer Week and later helping launch websites and email newsletter delivery for FCW. After a turn at Virginia’s Center for Innovative Technology, where she worked to promote technology-based economic development, she rejoined what was to become 1105 Media in 2004, eventually managing content and production for all the company's government-focused websites. Miller shifted back to editorial in 2012, when she began working with GCN.
Miller has a BA and MA from West Chester University and did Ph.D. work in English at the University of Delaware.
Connect with Susan at @sjaymiller.