Cloud-based infrastructure and IT self-service models have made it incredibly easy for developers and operations teams to spin up servers and meet fluctuating business needs. This elasticity is great for business agility and is a quintessential DevOps goal, but there is a dark side to easily obtained infrastructure that can work against company efficiency.
For many businesses, cloud infrastructure needs are lumpy. There are distinct peaks in compute demand that can be traced to any number of causes: a marketing campaign goes live and causes a spike in web traffic, an analytics or simulation project must scale to crunch data, or an event occurs, such as a training session or user conference, that requires additional scalable infrastructure.
Often the focus is on preparing for this spike and avoiding the associated risks of failure. Following the spike, businesses are busy responding to its outcomes, such as more leads, adjusting development, or preparing for the next event. It’s during this time that businesses are exposed to unnecessary costs from resources that were once fully utilized but have since fallen idle.
We were recently contacted by a large software vendor to address this specific problem of unused infrastructure. By automating the process of decommissioning servers after use, the company hopes to eliminate a significant amount of wasted spend. The idea is to give users the option of extending a lease if needed, but otherwise to automatically decommission the compute resources after a certain date.
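The lease-and-decommission idea above can be sketched as a small policy check. Everything here is illustrative: the lease records, field names, and grace period are assumptions, and a real implementation would read leases from instance tags or a tracking database and call the EC2 API to act on the result.

```python
from datetime import date, timedelta

# Hypothetical lease records; in practice these might live in tags on
# each instance or in a tracking database.
leases = [
    {"instance_id": "i-0aaa", "expires": date(2024, 3, 1), "extended": False},
    {"instance_id": "i-0bbb", "expires": date(2024, 6, 1), "extended": True},
]

def instances_to_decommission(leases, today, grace_days=7):
    """Return instance IDs whose lease (plus a grace period) has lapsed
    and whose owner has not asked for an extension."""
    return [
        lease["instance_id"]
        for lease in leases
        if not lease["extended"]
        and today > lease["expires"] + timedelta(days=grace_days)
    ]

print(instances_to_decommission(leases, today=date(2024, 4, 1)))
```

A scheduled job running this check daily would notify owners as the cutoff approaches and then stop or terminate whatever remains on the list.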
A similar problem occurs in businesses that have distinct phases. An organization may have very different compute needs in the research and development phase versus the build or execution phase. Avoiding being locked into application licenses or cloud infrastructure for the long term is essential to the agility of these organizations. Whether it lingers for three months or three years, unused infrastructure can take a toll on resources and negate the savings and opportunities gained from the initial elasticity.
Let’s explore some areas in an AWS-based cloud infrastructure that are bound to increase your bills:
AWS Compute Resources:
In creating AWS cost management strategies, one of the first things to analyze is instance sizing. Chances are that you’re running an instance that is oversized for your business requirements. While the extra capacity may have been handy in the moment of need, keeping the instance running when it is not needed can quickly add to your AWS bill.
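To make the oversizing point concrete, here is a back-of-the-envelope savings estimate for moving an always-on instance to a smaller type. The hourly rates are example figures, not authoritative prices; always look up current on-demand pricing for your region before acting on numbers like these.

```python
# Example on-demand hourly rates (assumed figures for illustration).
HOURLY_RATE = {"m5.xlarge": 0.192, "m5.large": 0.096}

def monthly_savings(current, proposed, hours_per_month=730):
    """Estimate the monthly saving from downsizing an always-on instance."""
    delta = HOURLY_RATE[current] - HOURLY_RATE[proposed]
    return round(delta * hours_per_month, 2)

print(monthly_savings("m5.xlarge", "m5.large"))
```

Even at these modest rates, a single oversized instance left running around the clock costs tens of dollars a month; multiplied across a fleet, right-sizing becomes one of the fastest wins available.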
Consider another scenario: unused instances. To take advantage of the scalability of a cloud architecture, you spin up many instances to ensure high availability. Some questions to ask here are:
Is there a business need to run all instances?
Can you turn off instances that sit unused overnight?
Which instances have been left behind unused?
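One common answer to the overnight question is a tag-driven off-hours policy: instances tagged with a schedule are stopped outside their working window. The tag values, inventory format, and 8-to-18 window below are assumptions for illustration; a real version would read tags via the EC2 API and run from a scheduled job.

```python
# Hypothetical instance inventory with an assumed "schedule" tag.
instances = [
    {"id": "i-0dev",  "schedule": "office-hours"},
    {"id": "i-0prod", "schedule": "always-on"},
]

def stoppable(instances, hour):
    """Return IDs of instances that may be stopped at the given hour (0-23),
    i.e. office-hours instances outside the 08:00-18:00 window."""
    off_hours = hour < 8 or hour >= 18
    return [i["id"] for i in instances
            if i["schedule"] == "office-hours" and off_hours]

print(stoppable(instances, hour=22))
print(stoppable(instances, hour=10))
```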
Autoscaling is one of the most attractive cloud features: it keeps your infrastructure elastic through predefined rules that decide when to spin up new instances and when to shut them down. However, setting autoscaling policies without proper analysis of business needs can lead to increased bills. Doing it right calls for load testing the application, setting proper CPU thresholds, and deriving appropriate autoscaling policies from the results.
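The threshold rule an autoscaling policy encodes can be sketched in a few lines. The 70%/30% thresholds here are placeholders, the kind of values that should come out of load testing rather than guesswork; a real policy would be configured in an Auto Scaling group, not hand-rolled.

```python
def scaling_decision(cpu_samples, scale_out=70.0, scale_in=30.0):
    """Simple threshold rule: average CPU above scale_out adds capacity,
    below scale_in removes it, otherwise no change."""
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg > scale_out:
        return "scale-out"
    if avg < scale_in:
        return "scale-in"
    return "no-change"

print(scaling_decision([85, 90, 78]))
print(scaling_decision([12, 20, 18]))
```

A scale-in threshold set too low keeps excess capacity billed long after traffic subsides, which is exactly how poorly analyzed policies inflate the bill.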
AWS Storage Resources:
Storage services are essential to any application. Some notable AWS storage solutions include:
S3: Highly durable and scalable cloud storage
Glacier: Secure and durable cloud archive storage, typically used for data archives, logs, or, in simple terms, data that is not frequently accessed.
Elastic Block Store (EBS): Block-level storage volumes used with EC2 instances.
While the list does not end there, it’s important to understand that these services are offered on a pay-per-use model, so they also call for cost management strategies. Consider the following questions to help analyze your usage of AWS storage:
Are there any oversized EBS volumes?
Are there unused EBS volumes and snapshots lying around, unnecessarily increasing your costs?
Can you push the data that you don’t use frequently to Glacier?
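Answering the first two questions starts with an inventory sweep for volumes that are billed but attached to nothing. The inventory below is hypothetical; in practice it would come from the EC2 API (e.g. describe-volumes) or a tracking tool such as CloudWatch or CloudCheckr.

```python
# Hypothetical EBS volume inventory.
volumes = [
    {"id": "vol-001", "attached": True},
    {"id": "vol-002", "attached": False},
    {"id": "vol-003", "attached": False},
]

def unattached_volumes(volumes):
    """Flag EBS volumes not attached to any instance: they accrue
    charges without doing any work."""
    return [v["id"] for v in volumes if not v["attached"]]

print(unattached_volumes(volumes))
```

Volumes flagged this way are candidates for snapshotting and deletion, and long-untouched snapshots are in turn candidates for archival to Glacier.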
AWS Support Plans:
AWS offers several attractive support plans, with features ranging from round-the-clock support, support forums, and technical support via email, phone, and chat, to third-party software support, and response times ranging from under 15 minutes to roughly 12 hours depending on the plan. A wise thing to do is analyze your actual need for AWS support: if you use only a few AWS services, or raise tickets just once or twice a month, paying for a higher plan is futile.
One of the solutions for tackling such unused cloud infrastructure is to set up appropriate resource tracking using AWS services like CloudWatch or third-party tools like CloudCheckr, which can be used for analytics and reporting.
However, solving this challenge requires a broad base of skills that is not available in most enterprises or startups.
First, in-depth knowledge of cloud architecture is required. This is difficult, complex technology that is based on, but differs from, the hardware concepts familiar to most operations professionals.
Second, software development skills must be put into play to create the actual scripts the automation will rely on.
Third, the domain-specific knowledge of AWS that certified AWS architects have is required - a box we are happy to tick as AWS Advanced Tier partners.