Increasing ROI starts with faster response
- By Lee Koepping
- Jan 27, 2021
As the pandemic continues, organizations remain challenged by new ways of working, communicating and delivering day-to-day services. Government IT teams, especially, face evolving obstacles and competing priorities because even seconds of network or application downtime can mean the difference between mission success and failure to deliver.
To measure downtime and response, tech teams monitor mean time to resolution (MTTR), the average time it takes to resolve an incident. A long MTTR can threaten the agency's mission and constituent experience by disrupting accessibility and increasing costs and complexity, from the software development process to the delivery of services.
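As a rough illustration of the metric, mean time to resolution can be computed as the average gap between when each incident was opened and when it was resolved. The sketch below is a minimal example, not any particular monitoring product's method; the function name `mean_time_to_resolution` and the incident timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - opened) over a list of (opened, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    # Sum the individual resolution times, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (opened, resolved) timestamps.
incidents = [
    (datetime(2021, 1, 4, 9, 0), datetime(2021, 1, 4, 11, 0)),    # 2 hours
    (datetime(2021, 1, 5, 14, 0), datetime(2021, 1, 5, 20, 0)),   # 6 hours
    (datetime(2021, 1, 6, 8, 30), datetime(2021, 1, 6, 12, 30)),  # 4 hours
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 4:00:00
```

Tracking this number over time, rather than as a single snapshot, is what shows whether a change in tooling or process is actually shortening response.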
To mitigate this, IT operations teams work relentlessly behind the scenes to ensure the digital experience is reliable and resilient, addressing incidents even before users notice -- often relying on manual processes and human insight. However, the speed of today's "always-on" operations and the increasingly complex IT ecosystem can render manual processes obsolete. Agencies need automated solutions to detect, diagnose and repair issues at machine speed.
Human diagnostics are insufficient in MTTR reduction
Preventing IT disruptions is not a turnkey solution but a multistage process. Operations departments must have "eyes on" the services, applications and infrastructure. In many cases, these data points are siloed across multiple sensors, technologies and environments. Often, IT staff must keep inventory of IT and non-IT assets with visual relationship maps and alerts so they know what they have, where it is and how it’s performing.
Even with systems that can gather and display these critical metrics, humans cannot keep up. Once considered nice-to-have or futuristic tech, automation at machine speed is now essential -- even for the smallest agencies.
The role of artificial intelligence
For most organizations, the average MTTR is four hours, but for the government, that number doubles. That's because data in the federal IT world is siloed, leading to inconsistencies between departmental data. "War room" cultures, legacy mindsets and, of course, Murphy's Law also play a role.
Traditionally, when a problem arises, whether discovered by a tool or an angry customer, various teams all jump on a conference bridge to defend their systems. The last one remaining on the call is stuck with fixing the problem.
This is, of course, unsustainable in today’s world. Agencies need a unified solution to alert them in real time, or as close to it as possible, when a service is degrading. Artificial intelligence and machine learning offer a way to reduce MTTR cycles. With ML, agencies can flag abnormal behaviors immediately and use that data to help identify the issue, its location, its potential impact, its cause and, ultimately, possible resolutions. AIOps, the application of AI to IT operations, extends this detection further, employing and expanding automation to maximize efficiency, effectiveness and resilience.
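To make "flagging abnormal behaviors" concrete, here is a minimal sketch that marks a metric sample as anomalous when it strays too far from its recent baseline. It uses a trailing-window z-score, a simple statistical stand-in for the ML models described above, not an actual AIOps implementation. The function name `flag_anomalies`, the window size, the threshold and the latency values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the mean of the preceding
    `window` samples by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is unusual.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical response-time readings in milliseconds, with one spike.
latencies_ms = [101, 99, 100, 102, 98, 100, 101, 450, 99, 100]

anomalies = flag_anomalies(latencies_ms)
print(anomalies)  # [7]
```

A real deployment would feed flagged samples into correlation and root-cause steps; this sketch covers only the detection stage.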
AIOps is the holy grail for improved government operations
Fortunately, AIOps is no longer limited to deep-pocketed businesses or well-resourced agencies that can create DevOps solutions and employ a team of data scientists. The IT industry has risen to the challenge, and innovative companies are offering AIOps solutions that are designed for specific problems and offer critical insights. These tools reduce, rather than replace, the human workload because a person must still respond and make decisions based on the data revealed. Agencies can determine when and how much to automate functions based on their needs and readiness for what can be a long journey.
The silver lining? In a journey that moves from crawling to walking to running, even small steps forward demonstrate progress.
Change by design, or by disaster
In today's world, tools and services providers must meet agencies where they are. Government IT managers should invest in tools that fit their culture and adapt to it. It is no longer enough to have the "best" widget -- consumable and flexible operations matter more.
It’s increasingly undeniable that AI and ML are the only ways to survive in today’s IT operations. Getting the right tools and ecosystem in place to harness advanced capabilities positions teams and organizations to move forward. Agencies in pursuit of AIOps should remember that it is a journey of building trust and experience, not a product.
Lee Koepping is principal solutions architect with ScienceLogic.