The smarter city: Las Vegas tunes IT operations with AIOps

From smart lights in parks to sensors that can detect potholes before they happen, Las Vegas is leveraging a wide range of connected technologies to transform itself into one of the smartest cities in the United States. To reach its goal, the city began applying machine learning and advanced analytics to the data coming from its IT operations, using AIOps, or artificial intelligence for IT operations.

AIOps platforms, according to Gartner, “combine big data and AI or machine learning functionality to enhance and partially replace a broad range of IT operations processes and tasks, including availability and performance monitoring, event correlation and analysis, IT service management, and automation.”

Las Vegas is using FixStream’s AIOps platform to locate problems in IT systems before they cause downtime. The platform works both in traditional IT environments, like Oracle’s enterprise resource planning system, and with the ServiceNow cloud-based help-desk platform the city uses.

When hardware and software systems fail, the city wants to know “where to go solve these issues promptly,” Director of Technology and Innovation Michael Sherwood told GCN. As Las Vegas builds out its smart city infrastructure, “we are not going to have the ability to have systems down for hours, especially in the future, so we need to get the hardware and software services back operating as soon as possible.”

For the past nine months, Sherwood said, his team has been working with FixStream to monitor systems that are used to carry out city business, like the Oracle ERP system that processes the city’s bills.

“FixStream provides us with insights into our network, intelligence about our operational practices and gives us a complete scope of view or sphere of influence over all of our network assets,” Sherwood said.

For now, the AIOps platform is used to monitor critical business systems in Las Vegas, but Sherwood said he sees the possibility of expanding coverage to network infrastructure and smart city components such as traffic signals.

The FixStream system is designed to complement Cisco’s Smart+Connected Digital Platform for internet of things sensors, which Las Vegas announced it would use in July 2017. The Cisco sensors collect data on lighting, traffic and environmental conditions that is then used to help make business decisions.

“Cisco is a major player in our IT transformation,” but it is focused on gathering and analyzing data from its own platform that the city can use. It doesn’t address processes within the city’s IT networks, “which is where FixStream comes in,” Sherwood said. “If something goes wrong in one of our different systems, FixStream [can] tell us where it is broken so we can go out and repair that system and bring it back online.”

FixStream can operate on-premises or in the cloud. When it starts working with an enterprise, the company begins with auto-discovery of the IT environment to learn all of the different systems on the networks and create a topology of the entire IT infrastructure.

Next, the platform uses machine learning to correlate and map the relationships among different data resources, devices and applications. It then uses that information to identify patterns and anomalies that help it predict when and where trouble spots might appear.

“The FixStream platform works through correlating end-to-end systems across all entities, devices and data across the entire IT stack including business transactions from orders and invoices all the way down to servers and compute network and storage,” FixStream Chief Marketing Officer Enzo Signore said. “We can stitch the information together end to end and apply machine learning capabilities so we can find patterns and predict outages across the entire stack.”

Sherwood said his city’s journey with FixStream is still about trying to understand how the various system processes work before adding in machine learning and artificial intelligence to track abnormalities and take action. However, Sherwood has high hopes for what the AIOps technology will bring to the city’s enterprise IT environment.

“We are looking at expanding it to all areas of operations [including] our custom programming and cloud services,” Sherwood said. AIOps will allow the city to spend less time monitoring and managing applications so it can devote its resources to “creating and delivering new services,” he said.

Other companies (such as CA Technologies and Splunk) also offer AIOps services, but the technology is not widely used by government agencies, according to Charley Rich, a research director in Gartner’s IT Operations Management Group.

AIOps uses three basic machine learning techniques to analyze information coming from IT operations, Rich said: clustering, anomaly detection and causality.

Clustering takes the data on applications, availability, response times and transactions related to IT events, analyzes it for patterns and filters out false alarms. That improves the signal-to-noise ratio and cuts down the amount of data IT teams have to manage.
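To make the idea concrete, here is a minimal sketch in Python. It is not how FixStream or any other vendor does it: it assumes a hypothetical `Event` record and uses a simple time-window rule as a stand-in for a learned clustering model, grouping alerts from the same source that arrive within 60 seconds of one another so that a burst of related alarms shows up as one item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical alert record; the field names are illustrative assumptions."""
    timestamp: float  # seconds since an arbitrary reference point
    source: str       # application or device that raised the alert
    message: str


def cluster_events(events, window=60.0):
    """Group events from the same source that arrive within `window`
    seconds of the previous event from that source."""
    clusters = []
    latest = {}  # source -> index of that source's most recent cluster
    for ev in sorted(events, key=lambda e: e.timestamp):
        idx = latest.get(ev.source)
        if idx is not None and ev.timestamp - clusters[idx][-1].timestamp <= window:
            clusters[idx].append(ev)
        else:
            clusters.append([ev])
            latest[ev.source] = len(clusters) - 1
    return clusters


# Five raw alerts collapse into three groups for an operator to review.
alerts = [
    Event(0, "erp", "slow response"),
    Event(10, "erp", "slow response"),
    Event(15, "helpdesk", "login failure"),
    Event(20, "erp", "timeout"),
    Event(500, "erp", "slow response"),
]
groups = cluster_events(alerts)
print(len(alerts), "alerts ->", len(groups), "clusters")
```

A production system would cluster on many more attributes than source and time, but the effect is the same: fewer, more meaningful items for the operations team.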

Anomaly detection looks at data over time and learns to spot deviations from normal patterns. The system would learn, for example, that operations during peak hours will differ from those on the weekend, Rich said. Over time, the algorithm learns what normal behavior looks like and so can spot irregularities.
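The peak-hours-versus-weekend example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method any AIOps product uses: the bucket names, the sample numbers and the three-standard-deviation threshold are all hypothetical. The sketch learns a separate baseline for each time bucket and flags a reading that strays too far from its own bucket's norm.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """history: iterable of (bucket, value) pairs.
    Returns bucket -> (mean, standard deviation) for buckets with 2+ samples."""
    by_bucket = defaultdict(list)
    for bucket, value in history:
        by_bucket[bucket].append(value)
    return {b: (mean(v), stdev(v)) for b, v in by_bucket.items() if len(v) >= 2}


def is_anomaly(baseline, bucket, value, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from its bucket's mean."""
    if bucket not in baseline:
        return False  # no learned notion of "normal" for this bucket yet
    mu, sigma = baseline[bucket]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical transactions-per-minute samples for two time buckets.
history = (
    [("peak", v) for v in (100, 110, 90, 105, 95)]
    + [("weekend", v) for v in (10, 12, 8, 11, 9)]
)
baseline = build_baseline(history)

print(is_anomaly(baseline, "peak", 100))     # False: typical for peak hours
print(is_anomaly(baseline, "weekend", 100))  # True: far above the weekend norm
```

The same reading of 100 is unremarkable at peak and alarming on a weekend, which is why a single fixed threshold across all hours tends to generate false alarms.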

Causality allows AIOps to determine the root cause of a failure by examining the time, location and interdependencies related to a problem, speeding troubleshooting and the problem’s resolution.
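A simplified Python sketch shows how interdependencies alone can narrow the search. It assumes a hypothetical, acyclic dependency map of the kind a topology-discovery step might produce, and it ignores time and location, which a real platform would also weigh. Among the components currently failing, it keeps only those with no failing component anywhere beneath them.

```python
def transitive_dependencies(depends_on, node):
    """Return everything `node` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(node, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(depends_on.get(current, ()))
    return seen


def root_cause_candidates(depends_on, failing):
    """Failing components that do not depend on any other failing component."""
    failing = set(failing)
    return {
        node
        for node in failing
        if not (transitive_dependencies(depends_on, node) - {node}) & failing
    }


# Hypothetical topology: each component maps to the components it relies on.
depends_on = {
    "billing": ["erp-app"],
    "erp-app": ["erp-db", "network"],
    "erp-db": ["storage"],
    "helpdesk": ["network"],
}

failing = {"billing", "erp-app", "erp-db", "storage"}
print(root_cause_candidates(depends_on, failing))  # {'storage'}
```

Four alarms point to one place to start: the storage layer that the database, the application and the billing process all sit on.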

AIOps is “early in the maturity model” and will grow as more enterprises learn about the technology, Rich told GCN. “Many of our government clients are more conservative and not as far along in their IT maturity journey as some sectors, but they want to learn more about what AIOps does, how to prepare for it and how to grow the skills within their workforces.”

To help government agencies become comfortable with the technology, Gartner recommends a four-phase approach:

Establishment phase: Selecting a small number of key business applications, assessing existing skill sets and taking an inventory of existing data sources.

Reactive phase: Building a semi-structured historical database, implementing visualization and natural language tools to access data and deploying statistical analysis.

Proactive phase: Implementing streaming data ingestion, using predictive analytics to anticipate incidents and engaging in root-cause analysis of complex problems.

Expansion phase: Expanding functionality to 20 or so business applications and sharing data and analysis with IT processes outside of IT operations.

 
