As a cloud consultant, I’ve had the opportunity to see dozens of different public cloud implementations – the good, the bad and the ugly.
Regardless of where you are on your journey, every organization faces challenges – whether it's unexpected costs, unplanned outages or security issues. In my experience, most are avoidable with a little extra planning.
As you invest more time and treasure in the cloud, I recommend the following to build the foundation to keep your cloud strategy moving in the right direction.
Most IT organizations are under intense pressure to provide cloud-based resources as quickly as possible. For the sake of speed, relatively junior people, or those who aren’t accountable for cost overruns, are given blanket rights to deploy whatever they want. I’ve seen organizations incur unnecessary costs due to people standing up workloads they don’t need. I’ve also seen outages occur because someone with universal access has inadvertently shut down infrastructure.
For most users, privileges should be limited to the minimum required to do their job and nothing more. In some cases, that means being able to stand up a workload or create a database – full stop. It takes a little extra time and forethought, but abiding by the principle of 'least privilege' is one of the foundational practices I recommend to anyone looking to save time, money and grief.
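The deny-by-default idea behind least privilege can be sketched in a few lines. This is an illustrative model, not any cloud provider's API – the role and action names are assumptions for the example.

```python
# A minimal deny-by-default permission check: each role maps to the
# smallest set of actions needed for the job, and anything not listed
# is refused. Role and action names here are hypothetical.

ROLE_PERMISSIONS = {
    "db-developer": {"database:create", "database:query"},
    "app-deployer": {"workload:deploy", "workload:stop"},
}

def is_allowed(role: str, action: str) -> bool:
    """An action is permitted only if the role explicitly lists it;
    unknown roles and unlisted actions are denied by default."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

With this model, a database developer can create a database but cannot shut down workloads – exactly the "full stop" scoping described above. Real cloud IAM policies (AWS IAM, Azure RBAC) follow the same explicit-allow, default-deny pattern.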
APIs are integral to life in the cloud. They enable functions and allow systems to share information. But if not handled properly, they pose a significant risk to your enterprise.
All too often API keys are stored in scripts or programs. This is of particular concern when developers use publicly accessible services like GitHub to manage code development. All it takes is someone outside your organization getting their hands on an API key to access your environment. If the owner of the key happens to have universal admin rights, they can then do whatever they like with your cloud infrastructure. I’ve seen organizations on the receiving end of substantial bills because an unscrupulous individual has used their infrastructure to add computing muscle for Bitcoin mining, among other activities.
I recommend never assigning an API key to anyone who doesn't need it, and for those who do, ensure you're practicing the principle of 'least privilege.' That way, even if a key leaks, you limit the damage. You should also rotate API keys the same way you rotate your passwords. Better yet, IAM Roles in Amazon Web Services issue short-lived credentials automatically, and Azure Key Vault keeps secrets out of your code entirely, making it much harder for API keys to fall into the wrong hands.
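The simplest way to keep keys out of scripts checked into GitHub is to read them from the environment at runtime and fail loudly if they're missing. A minimal sketch (the environment variable name is an assumption for the example):

```python
import os

def get_api_key(env_var: str = "MY_SERVICE_API_KEY") -> str:
    """Read the API key from an environment variable at runtime so it
    never appears in source control. Raises rather than falling back
    to a hard-coded default."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; refusing to fall back to a hard-coded key"
        )
    return key
```

From there, the environment variable can be populated by a secrets manager at deploy time instead of a file in the repository – which is the step services like Azure Key Vault and AWS IAM Roles automate for you.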
I’ve met with many CTOs who express frustration with the unchecked growth of their cloud spending. More often than not, we discover that IT is adding cloud resources to support application development and then failing to decommission them after the project is complete.
So-called ‘compute sprawl’ is one of the biggest contributors to cost overruns. Abiding by the principle of ‘least privilege’ gives you a head start by entrusting deployments only to those who can be held accountable for them.
Even so, you need a tool that can continuously monitor cloud usage. The Softchoice Cloud Dashboard is a great resource because it allows you to track consumption in real-time by department, project and individual user. It’s also free!
Documented policy and governance for how cloud resources are requested and retired are perhaps the best medicine of all. For example, every request for cloud-based compute should specify the amount and lifespan at the outset. As the project nears the end of that lifespan, reach out to see whether an extension is required and, if not, decommission the infrastructure on the agreed-upon date. Many governance activities can be automated using APIs, saving time in the long run.
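That lifespan check is easy to automate once each resource carries its agreed end date as a tag. A sketch under assumed conventions – the `decommission-after` tag name and the resource records are illustrative, and a real version would pull this inventory from your cloud provider's API:

```python
from datetime import date

def resources_due_for_decommission(resources, today=None):
    """Return resources whose agreed lifespan has passed, based on a
    hypothetical 'decommission-after' tag set when the resource was
    requested (ISO date, e.g. '2020-01-31')."""
    today = today or date.today()
    return [
        r for r in resources
        if date.fromisoformat(r["tags"]["decommission-after"]) < today
    ]
```

A scheduled job can run this daily, notify owners of anything on the list, and shut down whatever isn't granted an extension.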
You can also tag resources to distinguish production from testing and development environments. Tagging lets you generate reports that give a snapshot of your infrastructure and zero in on areas with low utilization, so you can either consolidate them or shut them down altogether.
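The kind of report described above can be sketched as a simple group-by on an `environment` tag, flagging anything under a utilization threshold. The tag name, record shape and 10% threshold are assumptions for the example:

```python
from collections import defaultdict

def utilization_report(resources, low_threshold=0.10):
    """Group resources by their (hypothetical) 'environment' tag and
    flag any whose average utilization falls below the threshold as
    candidates for consolidation or shutdown."""
    report = defaultdict(list)
    for r in resources:
        env = r["tags"].get("environment", "untagged")
        report[env].append({
            "name": r["name"],
            "low": r["avg_utilization"] < low_threshold,
        })
    return dict(report)
```

Grouping on tags only works if tagging is enforced at deploy time, which is another argument for the governance policies above – note how untagged resources are surfaced in their own bucket rather than silently dropped.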
Avoiding the most common mistakes isn’t that complicated. But it does take time and a little foresight. If you’ve got a best practice or a question, I’d love to hear it. Please feel free to share in the comments.