To make the massive Home Mortgage Disclosure Act data more accessible to the public, the Consumer Financial Protection Bureau focused on user feedback as it built a browser that filtered data from over 35 million records.
When the Consumer Financial Protection Bureau decided to develop a browser tool that would make the massive Home Mortgage Disclosure Act (HMDA) datasets more accessible to the public, it didn’t design the tool from top to bottom and release it with a big splash. Instead, said Eric Spry, the HMDA operations program manager, the CFPB focused on using “flexible architecture” that let the team get user feedback between iterations. “We started with user interviews in the Summer of 2018 to find out how people were using HMDA data, and what challenges they faced,” he said.
[Image: Home Mortgage Disclosure Act Data Browser. Credit: Consumer Financial Protection Bureau]
This approach helped Spry’s team identify the features in greatest demand. When most of the users they interviewed made clear that they were interested only in data on their immediate community, the team modified the HMDA browser to filter data down to a specific state or county, or to a specific lender.
The next challenge was making the data accessible to people at all research skill levels. After some digging, Spry’s team found spreadsheets were most users’ “common denominator,” so they made sure the data could be downloaded in spreadsheet formats. “This gives the user just the records they want in a more manageable format, and without specialized software that very large datasets require,” he said.
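The filter-then-download flow described above can be illustrated with a minimal Python sketch. It is not the CFPB's implementation: the sample rows, the field names (`state`, `county`, `lender`, `loan_amount`), and the helper functions `filter_records` and `to_csv` are hypothetical and are not drawn from the actual HMDA schema or codebase.

```python
import csv
import io

# Hypothetical sample rows; field names are illustrative, not the real HMDA schema.
RECORDS = [
    {"state": "MD", "county": "24031", "lender": "Bank A", "loan_amount": 250000},
    {"state": "MD", "county": "24033", "lender": "Bank B", "loan_amount": 180000},
    {"state": "VA", "county": "51059", "lender": "Bank A", "loan_amount": 320000},
]


def filter_records(records, **criteria):
    """Return only the rows that match every supplied field=value pair."""
    return [
        row for row in records
        if all(row.get(field) == value for field, value in criteria.items())
    ]


def to_csv(records, fieldnames):
    """Serialize rows to CSV text that any spreadsheet program can open."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    subset = filter_records(RECORDS, state="MD", lender="Bank A")
    print(to_csv(subset, ["state", "county", "lender", "loan_amount"]))
```

The point of the sketch is that the user receives only the matching rows in a plain spreadsheet format, rather than the full dataset.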
What features not to include proved just as important to Spry’s team. Their goal was to keep the tool simple and accessible. With the HMDA data browser pulling from a database of over 35 million records and growing, overdoing the options was a definite risk. “It’s always tempting to try and solve for all possible users and use cases, but complex projects quickly get bogged down by too many features,” said Spry.
The sheer volume of data also made it hard to keep the system responsive for users. To solve the problem, the team decided not to allow direct database queries. Instead, they came up with a caching methodology that anticipated most of the possible combinations of user queries and pre-generated the corresponding files. “Cloud resources allow us enough scale to process a massive number of cached files ahead of each data release,” Spry said. “But we did find that over optimizing has its own set of problems, and finding that right balance is the key.”
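The general idea of pre-generating results for anticipated filter combinations can be sketched roughly as follows. This is an assumption-laden illustration, not the team's actual caching code: the filter dimensions, the key format, and the functions `cache_key`, `pregenerate`, and `lookup` are all hypothetical, and an in-memory dictionary stands in for the cached files the article describes.

```python
from itertools import product

# Hypothetical sample rows and filter dimensions (not the real HMDA schema).
RECORDS = [
    {"state": "MD", "lender": "Bank A", "loan_amount": 250000},
    {"state": "MD", "lender": "Bank B", "loan_amount": 180000},
    {"state": "VA", "lender": "Bank A", "loan_amount": 320000},
]

# None means "no filter applied on this field".
FILTER_VALUES = {
    "state": [None, "MD", "VA"],
    "lender": [None, "Bank A", "Bank B"],
}


def cache_key(combo):
    """Build a deterministic key from the active filters, e.g. 'lender=Bank A&state=MD'."""
    parts = [f"{name}={combo[name]}" for name in sorted(combo) if combo[name] is not None]
    return "&".join(parts) or "all"


def pregenerate(records, filter_values):
    """Compute the result set for every combination of filter values ahead of time."""
    cache = {}
    names = list(filter_values)
    for values in product(*(filter_values[name] for name in names)):
        combo = dict(zip(names, values))
        cache[cache_key(combo)] = [
            row for row in records
            if all(value is None or row[name] == value for name, value in combo.items())
        ]
    return cache


def lookup(cache, **criteria):
    """Serve a request from the pre-generated cache; no database query is issued."""
    return cache[cache_key(criteria)]


if __name__ == "__main__":
    cache = pregenerate(RECORDS, FILTER_VALUES)
    print(len(cache), "cached result sets")
    print(lookup(cache, state="MD"))
```

The number of cached entries is the product of the option counts for each filter, so every added filter multiplies the work done before a release. That growth is one plausible reading of the over-optimization trade-off Spry mentions.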
Moving forward, crowdsourcing and transparency look to be continuing themes with the new system – developed in the spirit of the HMDA itself. “The Home Mortgage Disclosure Act since 1975 has always been about public transparency of dwelling secured lending,” said Spry. “Our work at the CFPB continues that tradition of how we provide access to this historic public resource, and how data users’ needs are changing as well.”
Spry said the system is built to encourage an “ecosystem of citizen coders” to help the team continue to improve it. “We are very proud to share that the HMDA Data Browser and its underlying platform are all developed in the open, using public program code repositories on GitHub. This allows interested public, researchers, and software developers to review our program code, suggest changes, and contribute to improvements. We also build all of our work on public APIs, which means that users can build their own interfaces and tools from the resources we provide.”