The "Mayor's Geek Squad" builds a DataBridge to integrate data from 40 agencies and find previously unknown patterns and relationships for the city to follow up on.
New York City is taking urban predictive data analytics to new heights.
Project at a glance
Project: DataBridge and analytics warehouse
Office: New York Mayor’s Office of Data Analytics
Technology used: SAS Analytics for data analysis; Oracle database as a foundation for the warehouse; Palantir’s data fusion software
Cost: $1 million investment for initial infrastructure
The most populous city in the United States, with 8 million residents, faces huge fiscal challenges that are forcing city officials to make tough decisions about which citizen services to support. An eight-member team of data analysts, however, is helping agencies apply data and analytics to better manage and allocate those resources. Dubbed the “Mayor’s Geek Squad,” members of the Mayor’s Office of Data Analytics (MODA) have created the DataBridge, a common data source from which agencies can access and extract a trove of regulatory data.
DataBridge unites formerly stove-piped information on a single platform, allowing for cross-departmental data analysis from 40 different agencies. By applying analytics, MODA finds previously unknown patterns and relationships that lead to better decisions and resource allocation.
Over the past three years the squad has doubled the city’s hit rate in finding stores selling bootleg cigarettes, accelerated the removal of trees destroyed by Hurricane Sandy and directed housing and fire inspectors to structures that have been illegally sub-divided or are at risk of catching fire.
Last fall, officials with the city’s Department of Environmental Protection cracked down on restaurants that were illegally dumping cooking oil into sewers in their neighborhoods, clogging up drains in the process. The health department typically would send inspectors to restaurants on blocks with backed-up sewers and hope by chance to catch someone pouring the contents of a deep fryer into the street. MODA was able to compile data from the Business Integrity Commission, which certifies that all local restaurants have a carting service to haul away their grease. Through several quick calculations the team compared restaurants that did not have a carter with geo-spatial data on the sewers. Then they were able to give inspectors a list of statistically likely suspects.
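The grease-dumping comparison can be illustrated with a minimal Python sketch. It is not MODA's actual calculation: the record layout, the sample restaurants, the 150-meter radius and the function names are all assumptions made for illustration. The idea is simply to drop restaurants that have a carter, count the sewer backups near each remaining one and rank the results.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000

# Hypothetical sample records; real data would come from the agencies involved.
RESTAURANTS = [
    {"name": "Carted Cafe", "lat": 40.7501, "lon": -73.9900, "has_carter": True},
    {"name": "Fry House", "lat": 40.7502, "lon": -73.9900, "has_carter": False},
    {"name": "Corner Grill", "lat": 40.7601, "lon": -73.9800, "has_carter": False},
    {"name": "Far Diner", "lat": 40.8000, "lon": -73.9500, "has_carter": False},
]
SEWER_BACKUPS = [
    {"lat": 40.7500, "lon": -73.9900},
    {"lat": 40.7505, "lon": -73.9900},
    {"lat": 40.7600, "lon": -73.9800},
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def likely_suspects(restaurants, backups, radius_m=150):
    """Rank restaurants with no carter by the number of sewer backups nearby."""
    ranked = []
    for r in restaurants:
        if r["has_carter"]:
            continue  # a certified carter hauls this restaurant's grease
        nearby = sum(
            1
            for b in backups
            if haversine_m(r["lat"], r["lon"], b["lat"], b["lon"]) <= radius_m
        )
        if nearby:
            ranked.append((r["name"], nearby))
    # Most nearby backups first
    return sorted(ranked, key=lambda item: item[1], reverse=True)


suspects = likely_suspects(RESTAURANTS, SEWER_BACKUPS)
```

A production version would likely use a spatial index or a GIS join rather than comparing every restaurant with every backup, but the filter-then-rank logic would be the same.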
The whole data-driven approach is successful because the eight-member MODA team has worked closely with its agency partners to ensure that their requirements are being met. “If you don’t do this you might come up with solutions that don’t make sense or ones that people do not have the resources to implement,” said Lauren Talbot, chief programming analyst with MODA.
“All of the agencies are focused on delivering high-quality services” and embrace solutions such as analytics that can help them better provide core services, said Chris Corcoran, deputy analytics officer with MODA.
The DataBridge is a combination of technologies. The foundation is a data warehouse with a suite of SAS Analytics tools and data fusion software from Palantir. Agencies can patch in data from their sources and perform analysis on that information, Corcoran said. In addition to putting data in one place, MODA uses a geocoding system that lets the team associate geoidentifiers with addresses and other geographic information. This allows data from multiple agencies to be merged and analyzed together, Talbot said. Prior to DataBridge, analysts could not use one agency’s data to predict an outcome in another agency’s regulatory area; now they can.
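To show why a shared geoidentifier matters, here is a small Python sketch of merging records from two agencies that spell the same address differently. Everything in it is assumed for illustration: the `GEOCODER` lookup table stands in for a real geocoding service, and the building IDs, field names and normalization rules are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for a geocoding service: normalized address -> building ID.
GEOCODER = {
    "123 MAIN ST": "B-0001",
    "45 OAK AVE": "B-0002",
}

# Only a couple of suffixes are handled here; real address cleanup is far messier.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE"}


def normalize(address):
    """Uppercase, drop periods, collapse whitespace and abbreviate street suffixes."""
    tokens = address.upper().replace(".", "").split()
    return " ".join(SUFFIXES.get(t, t) for t in tokens)


def merge_by_building(**agency_records):
    """Group each agency's records under the building ID their address resolves to."""
    merged = defaultdict(dict)
    for agency, records in agency_records.items():
        for rec in records:
            building_id = GEOCODER.get(normalize(rec["address"]))
            if building_id is None:
                continue  # unmatched address; a real pipeline would queue it for review
            # Keeps one record per agency per building (later records overwrite earlier ones)
            merged[building_id][agency] = {k: v for k, v in rec.items() if k != "address"}
    return dict(merged)


merged = merge_by_building(
    finance=[{"address": "123 Main Street", "tax_arrears": True}],
    buildings=[
        {"address": "123 main st.", "open_violations": 3},
        {"address": "45 Oak Avenue", "open_violations": 0},
        {"address": "999 Nowhere Blvd", "open_violations": 1},
    ],
)
```

Once both agencies' records sit under the same building ID, a question such as "which buildings with open violations also have tax arrears" becomes a simple lookup.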
For example, the New York City Fire Department is applying data and analytics to change the way FDNY conducts daily building inspections, helping the city’s 341 fire units more accurately target buildings that are potential fire risks. The Risk Based Inspection System mines information from databases across the city to help prioritize the 50,000 buildings firefighters inspect annually.
FDNY built its own data warehouse, where the department could store inspection information, such as the building’s occupancy class, whether it has sprinklers or if it is fire-proofed. The Risk Based Inspection System pulls information from the FDNY data warehouse as well as from databases from the City Planning, Buildings, Environmental Protection and Finance departments, using the DataBridge.
The system lets FDNY prioritize inspections based on specified risk criteria, such as the type of building (home, storefront, manufacturing facility), the construction material, the building’s fire-proof features, the height and age of the building, the last inspection date, occupancy and violation history.
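Prioritizing on criteria like these can be sketched as a weighted score. The Python below is only an illustration: the article does not describe FDNY's actual scoring, so the weights, caps, field names and sample buildings are all assumptions, and only a subset of the listed criteria is included.

```python
from datetime import date

# Hypothetical weights by building type; the real system's values are not public here.
TYPE_WEIGHT = {"home": 1.0, "storefront": 2.0, "manufacturing": 3.0}


def risk_score(building, today):
    """Combine several risk criteria into one number (higher means inspect sooner)."""
    years_since_inspection = (today - building["last_inspected"]).days / 365
    age_years = today.year - building["year_built"]

    score = TYPE_WEIGHT.get(building["type"], 1.0)
    score += 0.0 if building["fireproof"] else 2.0
    score += 0.0 if building["has_sprinklers"] else 1.5
    score += min(building["floors"], 20) * 0.1       # taller buildings score higher
    score += min(age_years, 100) * 0.02              # older buildings score higher
    score += min(years_since_inspection, 5) * 0.5    # overdue inspections score higher
    score += min(building["violations"], 10) * 0.3   # violation history
    return score


def inspection_order(buildings, today):
    """Return buildings sorted from highest to lowest risk score."""
    return sorted(buildings, key=lambda b: risk_score(b, today), reverse=True)


# Hypothetical sample buildings
BUILDINGS = [
    {"id": "B1", "type": "home", "fireproof": True, "has_sprinklers": True,
     "floors": 2, "year_built": 2005, "last_inspected": date(2013, 1, 1), "violations": 0},
    {"id": "B2", "type": "manufacturing", "fireproof": False, "has_sprinklers": False,
     "floors": 5, "year_built": 1920, "last_inspected": date(2009, 6, 1), "violations": 4},
]

ordered_ids = [b["id"] for b in inspection_order(BUILDINGS, today=date(2013, 6, 1))]
```

Capping each term keeps any single factor, such as a very old building, from dominating the ranking; that is a design choice of this sketch, not something the article attributes to FDNY.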
Meanwhile, the NYC Department of Buildings has judiciously applied data analytics to handle illegal conversion complaints, city officials said. The city receives 20,000 to 25,000 complaints of illegal conversions every year. An illegal conversion is an apartment or house with residents living above maximum occupancy, often a formerly legal space that has been subdivided in ways that make it unsafe for occupancy. A single-family home, for example, could be subdivided to house 30 individuals in crowded, unsafe conditions.
Illegal conversions represent significant public safety hazards from fire, crime and diseases. The NYC Department of Buildings has approximately 200 inspectors to look into those complaints, in a city of nearly a million buildings. Using data from 19 agencies, MODA built a file of all buildings in the city to help the city prioritize complaints that represent the greatest catastrophic risk.
Analysts looked at a range of information, such as whether an owner was in arrears on property taxes, whether a property was in foreclosure and the age of the structure. They then cross-tabulated that data against five years of historical data on all of the properties in the city that had structural fires, arranged by severity. The MODA team found certain high-risk indicators that correlated with structures that had fires. MODA now runs new illegal conversion complaints against that file to identify the complaints that fall in the top 5 percent for fire risk. Those complaints are sent to inspectors for urgent follow-up.
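The two steps described here, cross-tabulating indicators against fire history and then pulling out the riskiest 5 percent of new complaints, can be sketched in Python. This is a simplified illustration rather than MODA's model: the sample history, the indicator names and the scoring rule (summing the difference in fire rates for each indicator present) are assumptions.

```python
import math

# Hypothetical historical records: one row per building.
HISTORY = [
    {"tax_arrears": True, "foreclosure": False, "had_fire": True},
    {"tax_arrears": True, "foreclosure": True, "had_fire": True},
    {"tax_arrears": True, "foreclosure": False, "had_fire": False},
    {"tax_arrears": False, "foreclosure": False, "had_fire": False},
    {"tax_arrears": False, "foreclosure": False, "had_fire": False},
    {"tax_arrears": False, "foreclosure": True, "had_fire": True},
    {"tax_arrears": False, "foreclosure": False, "had_fire": False},
    {"tax_arrears": False, "foreclosure": False, "had_fire": True},
]


def fire_rate_by_indicator(history, indicators):
    """Cross-tabulate each indicator against fires: (rate when present, rate when absent)."""
    rates = {}
    for name in indicators:
        flagged = [h for h in history if h[name]]
        unflagged = [h for h in history if not h[name]]
        rates[name] = (
            sum(h["had_fire"] for h in flagged) / len(flagged) if flagged else 0.0,
            sum(h["had_fire"] for h in unflagged) / len(unflagged) if unflagged else 0.0,
        )
    return rates


def risk_score(complaint, rates):
    """Assumed scoring rule: add up the fire-rate gap for every indicator present."""
    return sum(
        with_rate - without_rate
        for name, (with_rate, without_rate) in rates.items()
        if complaint[name]
    )


def top_percent(complaints, score, percent=5):
    """Return the highest-scoring `percent` of complaints (at least one)."""
    if not complaints:
        return []
    k = max(1, math.ceil(len(complaints) * percent / 100))
    return sorted(complaints, key=score, reverse=True)[:k]


RATES = fire_rate_by_indicator(HISTORY, ["tax_arrears", "foreclosure"])
```

New complaints could then be ranked with `top_percent(complaints, lambda c: risk_score(c, RATES))`, and the resulting short list routed to inspectors first.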
In the past, building inspectors responding to complaints found seriously high-risk conditions 13 percent of the time. Now, they are finding these risky conditions 70 to 80 percent of the time, a five-fold return on inspection man hours, according to MODA.
Working with the Buildings Department, MODA is now focusing on analyzing a larger set of illegal conversion complaints using SAS Analytics, Corcoran said. MODA is applying risk filters to flag high-priority complaints within two minutes of a complaint being registered and printed out at the bureau command office.
A lot of work goes on behind the scenes to put the data in a format that is ready for analysis, Talbot said. “We’re working with real-world data. When it comes to us, it isn’t always perfect and it can be complicated to figure out,” she said. Bringing data to life requires real creative processes, and technology can help. SAS’ data integration and quality capabilities, to give one example, are critical to producing useful data, Talbot noted.
MODA also uses a variety of other tools, including Microsoft Excel for basic analysis, Microsoft SQL Server for data access, Oracle business intelligence tools for data look-up and Palantir for relationship and network mapping. The tools are designed to provide solutions for various classes of data analysts, from casual users to sophisticated programmers.
The real focus is to enhance public safety, whether that involves responding to building complaints or identifying the most critical areas to bring back online during an outage. The goal is to apply data analytics so “public safety can be [viewed] in a more intelligent way,” Corcoran said.