City uses a 'Google for machine data' to improve water use
- By Rutrell Yasin
- Aug 06, 2013
Denver Water, Colorado’s oldest and largest water utility, promotes the efficient use of water to 1.3 million people in the city and surrounding suburbs, but that job hasn’t always been easy.
The utility, a semi-government entity, relies on its Information Technology Division to identify and implement the appropriate applications and technology infrastructure to meet the utility’s business needs. Until recently, supporting the infrastructure and monitoring applications was a challenge. A deluge of machine data from logs and databases often overwhelmed IT administrators, hampering efforts to pinpoint problems when users notified the help desk. IT officials wanted to adopt a more proactive approach where administrators could solve application and system problems long before they impacted the users of these IT resources.
“We need to know before the user knows that something is wrong,” said Henri van den Bulk, an enterprise architect with Denver Water. The IT team grappled with how they could move to a more predictive approach. Classical monitoring tools required an investment in more IT infrastructure, and the division had limited resources in terms of budget and people.
The division settled on Splunk software, which ingests large volumes of machine data and provides the analytics to make sense of that data and pinpoint problems. Splunk Enterprise collects and indexes machine data generated by applications, servers and devices, whether they are physical, virtual or in the cloud. It also lets administrators troubleshoot application problems and investigate security incidents quickly, which helps organizations avoid service degradation or outages.
“I told people this is [like] a Google for our IT machine data,” van den Bulk said, describing how the software lets the IT staff search for problems, correlate data from different events, and look at data and trends predictively. For instance, they can use the analytical software to study trends in certain application and system errors, such as whether the errors occur during certain time periods or when the system is under heavy load. The division may not be dealing with petabytes of data, but even 100 gigabytes a day is a lot to consume.
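To make the idea of studying error trends by time period concrete, here is a minimal sketch in plain Python. It is not Splunk's query language or API, and the log format, sample lines and function name are all assumptions made up for illustration. It simply counts error events per hour of the day so that a recurring busy period stands out.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines: "<ISO timestamp> <LEVEL> <message>".
# The format is an assumption for illustration, not Denver Water's data.
SAMPLE_LOGS = [
    "2013-08-05T09:15:02 ERROR billing timeout",
    "2013-08-05T09:47:30 ERROR billing timeout",
    "2013-08-05T10:05:11 INFO billing ok",
    "2013-08-05T14:20:45 ERROR gis tile failure",
    "2013-08-06T09:03:19 ERROR billing timeout",
]


def errors_by_hour(lines):
    """Count ERROR events per hour of day across all dates."""
    counts = Counter()
    for line in lines:
        timestamp, level, _message = line.split(" ", 2)
        if level == "ERROR":
            counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


hourly = errors_by_hour(SAMPLE_LOGS)
# The hour with the most errors and how many errors it saw.
busiest_hour, busiest_count = hourly.most_common(1)[0]
```

With the sample lines above, the 9 a.m. hour collects the most errors, which is the kind of pattern an administrator would then compare against system load.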
Not long ago, van den Bulk turned one of the application teams loose with Splunk, saying: “Here’s your data, try to figure out how you want to use it.” They became creative in mining the data and found cases where users were having application issues but had never logged them or called the help desk.
And they went a step further after realizing that they didn’t know whether customers were using the features they had built into applications. Using Splunk analytics, they were able to determine the core sets of features the users were applying and those that were being ignored. Now, when the team is in the planning stage, they can focus on how the users are actually using applications instead of building features that go untouched, van den Bulk said.
The IT division has a blend of applications that need to be monitored and maintained, including asset management, customer information, geospatial, mobile and work management applications. The team has begun to integrate all of the systems so admins can get a transactional view of what’s happening with applications via Splunk. A work order generated for a technician in the field to service a water meter at a customer location goes through an elaborate process in several systems. The order might have been generated with geospatial data and passed to the asset management system and then to the work management system, which delivers the order to the technician in the truck. Splunk is being used to monitor this workflow from end to end, van den Bulk said. In most cases, business users are not aware when something is not working because IT administrators receive alerts and proactively start managing the systems.
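The end-to-end view of a work order can be pictured as grouping events from several systems by a shared order ID and checking which hand-off is missing. The sketch below assumes exactly that; the order IDs, the event tuples and the function name are hypothetical, and the stage names only mirror the sequence described above. It is plain Python, not a Splunk search.

```python
from collections import defaultdict

# Expected hand-offs, following the sequence described in the article.
PIPELINE = ["geospatial", "asset_management", "work_management", "technician"]

# Hypothetical events: (work_order_id, system that logged the order).
EVENTS = [
    ("WO-1001", "geospatial"),
    ("WO-1001", "asset_management"),
    ("WO-1001", "work_management"),
    ("WO-1001", "technician"),
    ("WO-1002", "geospatial"),
    ("WO-1002", "asset_management"),
]


def stalled_orders(events, pipeline=PIPELINE):
    """Map each incomplete order to the first stage that never saw it."""
    seen = defaultdict(set)
    for order_id, system in events:
        seen[order_id].add(system)

    stalled = {}
    for order_id, systems in seen.items():
        for stage in pipeline:
            if stage not in systems:
                stalled[order_id] = stage
                break
    return stalled


stalled = stalled_orders(EVENTS)
```

Here `WO-1002` never reaches the work management system, so an alert on the `stalled` mapping could reach an administrator before the technician notices a missing order.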
Splunk works with a concept called forwarders, which are lightweight pieces of code optimized to reduce the impact on host servers. They grab machine data and forward it to indexers, where the bulk of the work happens. This is where administrators can run analysis, queries and graphing. This is different from applying software agents to machines, which is a more heavy-handed approach, van den Bulk said. Splunk is more lightweight with some sophisticated capabilities to reduce overhead.
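The split between forwarders and indexers can be illustrated with a toy model. This is only a sketch of the division of labor described above, not Splunk's implementation or API: the class names, the in-memory queue standing in for the network, and the simple token index are all assumptions. The forwarder does almost nothing on the host, while the indexer does the parsing and answers searches.

```python
from collections import defaultdict
from queue import Queue


class Forwarder:
    """Runs on the host and stays cheap: tag each raw line and ship it."""

    def __init__(self, host, outbox):
        self.host = host
        self.outbox = outbox

    def forward(self, raw_line):
        # No parsing here; the heavy work is left to the indexer.
        self.outbox.put((self.host, raw_line))


class Indexer:
    """Runs centrally: tokenizes lines and builds a searchable index."""

    def __init__(self):
        self.events = []
        self.index = defaultdict(set)  # token -> ids of matching events

    def drain(self, inbox):
        while not inbox.empty():
            host, raw_line = inbox.get()
            event_id = len(self.events)
            self.events.append((host, raw_line))
            for token in raw_line.lower().split():
                self.index[token].add(event_id)

    def search(self, term):
        ids = sorted(self.index.get(term.lower(), ()))
        return [self.events[i] for i in ids]


# Hypothetical hosts and log lines; the queue stands in for the network.
channel = Queue()
Forwarder("app01", channel).forward("ERROR meter sync failed")
Forwarder("gis02", channel).forward("INFO tiles rendered")

indexer = Indexer()
indexer.drain(channel)
results = indexer.search("error")
```

Keeping the host-side code this thin is the point of the design: the cost of parsing and indexing lands on dedicated machines rather than on the servers being monitored.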
Now that the IT team has shown how large data sets can be collected, aggregated and analyzed, Denver Water officials are exploring how Splunk and other analytic tools can be applied to manage power and water consumption. Water is an important resource to track, since Colorado has experienced periods of drought. Over the next five years there will be a convergence of machine data and business data, van den Bulk said. “Machine data tells you what is going on; business data provides the context of what is going on,” he said.