Strategies for keeping VM sprawl from taking over your agency
When military forces go on a mission, they do so with controlled, strategic approaches designed to leave nothing to chance. Action is targeted, contained and precise.
If only the same could be said about agency IT networks.
To be fair, managing federal IT networks has always been a monumental task. They have traditionally been massive, monolithic systems that require significant resources to maintain.
One would think that this situation would have improved with the advent of virtualization, but the opposite has proved to be true. In many agencies, the rise of the virtual machine has led to massive VM sprawl, which wastes resources and storage capacity because of a lack of oversight and control over VM resource provisioning. Left unattended, VM sprawl can wreak havoc on network performance and, in some instances, can result in serious complications -- from degraded network and application performance to network downtime.
VM sprawl is the result of multiple factors. Oversized VMs that were provisioned with more resources than necessary waste storage and compute resources. Overallocation of RAM can also cause ballooning and performance degradation, while idle VMs take up compute and storage resources without doing any useful work.
There are two ways to successfully combat VM sprawl. First, administrators should put processes and policies in place to prevent it from happening. Even then, however, VM sprawl may occur, which makes it imperative that administrators also establish a second line of defense that keeps it in check during day-to-day operations.
Let’s take a closer look at strategies that can be implemented during Phase One -- the process phase -- and Phase Two -- the operational phase.
The best way to get an early handle on VM sprawl is to define specific policies and processes. This first phase involves a combination of five approaches, all designed to stop VM sprawl before it has a chance to spread.
- Establish role-based access control policies that clearly articulate who has the authority to create new VMs. These can be applied to individuals or, if necessary, entire departments. This strategy can greatly reduce the creation of unnecessary rogue VMs and snapshots.
- Allocate resources based on actual utilization. VM sprawl is often caused by poorly sized resource allocation. Implement a policy under which resource usage is monitored frequently and automatically, and allocations are adjusted as necessary.
- Challenge oversized VM requests. Building on the first two points, administrators should demand justification for any potentially oversized VMs.
- Create standard VM categories. While not all VMs are created equal or easily defined, creating standard categories can help filter out abnormal or oversized VM requests. Categories could include storage, databases and applications.
- Implement policies regarding snapshot lifetimes. Snapshots are meant to be temporary, but too often they run for days, sometimes even weeks, continuing to use vital resources and degrading application performance. Implementing a set timeframe for snapshot deletion can help keep those pests from living beyond their usefulness.
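Several of the policies above can be expressed as simple automated checks. The following Python sketch is illustrative only: the category names, size ceilings, three-day snapshot lifetime and the VMRequest and Snapshot types are all hypothetical assumptions, not values from any particular virtualization platform or management tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical size ceilings per standard VM category: (max vCPUs, max RAM in GB).
CATEGORY_LIMITS = {
    "application": (4, 16),
    "database": (8, 64),
    "storage": (4, 32),
}

# Hypothetical policy: snapshots older than this are flagged for deletion.
MAX_SNAPSHOT_AGE = timedelta(days=3)


@dataclass
class VMRequest:
    name: str
    category: str
    vcpus: int
    ram_gb: int


@dataclass
class Snapshot:
    vm_name: str
    created: datetime


def needs_justification(request: VMRequest) -> bool:
    """Return True when a request falls outside its category's standard size."""
    limits = CATEGORY_LIMITS.get(request.category)
    if limits is None:
        # A request that fits no standard category is always challenged.
        return True
    max_vcpus, max_ram_gb = limits
    return request.vcpus > max_vcpus or request.ram_gb > max_ram_gb


def expired_snapshots(snapshots, now):
    """Return the snapshots that have outlived the allowed lifetime."""
    return [s for s in snapshots if now - s.created > MAX_SNAPSHOT_AGE]
```

In practice, the request and snapshot records would come from whatever inventory the agency's virtualization platform exposes. The point of the sketch is only that each policy reduces to a rule that can be checked automatically rather than by hand.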
Unfortunately, VM sprawl can occur even if these initial defenses are put in place. Therefore, it’s incumbent upon IT teams to maintain a second layer of defense, the operational phase, that addresses sprawl during day-to-day operations.
Consider a scenario where a project is cancelled or delayed. Perhaps the IT team behind that project readied a batch of VMs in anticipation of the work to come, but was not informed when the entire thing was scuttled. Those VMs may continue to sit idle, eating up precious resources.
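Idle VMs like these can often be spotted from utilization history. As a minimal sketch, assume daily average CPU readings are available for each VM; the 5 percent threshold, the seven-sample minimum and the function name below are hypothetical choices rather than recommendations.

```python
def find_idle_vms(cpu_samples, threshold_pct=5.0, min_samples=7):
    """Return the names of VMs whose CPU use stayed below the threshold.

    cpu_samples maps a VM name to a list of daily average CPU percentages.
    VMs with fewer than min_samples readings are skipped, so a newly
    created VM is not flagged before it has had a chance to do any work.
    """
    idle = []
    for vm_name, samples in cpu_samples.items():
        if len(samples) >= min_samples and max(samples) < threshold_pct:
            idle.append(vm_name)
    return sorted(idle)
```

A real check would likely also consider memory, disk and network activity before a VM is reclaimed, since low CPU use alone does not prove a VM is abandoned.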
Or, think about what happens in an environment where storage is thin provisioned. Capacity may appear to be freed up at the datastore level after a VM is moved or deleted; however, the storage is not automatically released on the array. This can lead to out-of-storage conditions.
During operations, it’s important to use an automated approach to virtualization management that employs predictive analysis and reclamation capabilities. Using these solutions, federal IT managers can tap into data on past usage trends to optimize their current and future virtual environments. Through predictive analysis, administrators can apply what they’ve learned from historical analysis to address issues before they occur. They can also continually monitor and evaluate their virtual environments and get alerts when issues arise so problems can be remediated quickly and efficiently.
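To make the idea of predictive analysis concrete, the sketch below fits a straight-line trend to past daily storage usage and estimates how many days remain before a datastore fills up. It is a deliberately simple illustration under stated assumptions: the function name and inputs are hypothetical, and commercial virtualization management tools typically use more sophisticated models than a linear fit.

```python
def days_until_full(usage_gb, capacity_gb):
    """Estimate the days left before usage reaches capacity.

    usage_gb is a list of daily usage readings, oldest first. A
    least-squares line is fitted to the readings, and its slope (GB per
    day) is applied to the space remaining after the latest reading.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(usage_gb)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(usage_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage_gb))
    slope = sxy / sxx
    if slope <= 0:
        return None
    remaining_gb = capacity_gb - usage_gb[-1]
    # Clamp at zero in case the datastore is already at or over capacity.
    return max(remaining_gb / slope, 0.0)
```

An alert could then be raised whenever the estimate drops below a chosen lead time, giving administrators a chance to reclaim or add capacity before an out-of-storage condition occurs.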
While each of these strategies can be effective on its own in controlling VM sprawl, together they create a complete and sound foundation that can greatly improve and simplify virtualization management. They allow administrators to build powerful, yet contained, virtualized networks.
Joe Kim is executive vice president engineering and global CTO at SolarWinds.