Clearly, security is the single most important factor in designing the perfect data center. You’ve got to shield your business from attacks from the outside. But once that’s out of the way, you still face a world of attacks from the inside: the normal demands put on your data center by your customers and employees.
This is truer than ever before, thanks to digital transformation. Organizations of all sizes depend on a growing number of applications to stay competitive and, increasingly, are moving to the cloud.
This can be a double-edged sword. More apps mean greater agility, but also more points of failure. In a survey by HPE, 45% of enterprises reported running 500 or more business applications. These are complex webs we’re weaving.
In a less complex era, IT could afford to adopt a reactive strategy: solve problems once they emerge. That standard needs to change, now that we’re in an era in which a data center outage could mean a massive amount of core functionality going offline. Instead of reacting, enterprises need to think about adapting. They need to prevent problems before they emerge, and continuously improve architectures and configurations. Relying on Band-Aid solutions might not cut it anymore.
Let’s break this down and look at three major non-security issues facing modern IT, and then at some ideas for how to move forward.
Unplanned downtime is disastrous. First of all, it’s bad for the reputation of your business. In May 2017, British Airways made headlines around the world when a data center failure resulted in more than 400 flight cancellations and 75,000 grounded passengers. That’s not exactly good press.
While most unplanned outages don’t result in grounded passengers, most do result in grounded data and hefty costs. This can mean lost business from customers being unable to purchase your services and more lost revenue from employees being unable to work. On top of that, it takes time and money to track down the source of the outage and to pay for new equipment.
According to a Ponemon Institute study from 2016, each data center outage costs its enterprise an average of $740,257 in lost revenue, recovery, equipment, and other costs. Nor is this an uncommon event: in 2013, Ponemon reported that 91% of respondents had some sort of unplanned outage. The ubiquity of outages suggests that there’s a need for an entirely new approach to data center management.
The App-Data Gap
Outages aren’t the only problem, though. Even if your system stays up, there’s no guarantee of achieving peak performance. When an application is not performing as expected, finding the cause or causes of the problem is a complicated affair, because there are many variables between the app and the data. An application in this state is suffering from the app-data gap.
This has obvious consequences: your customers will buy less stuff if your systems run slowly, and your team won’t work as quickly. But there are also some less-obvious consequences: the app-data gap can limit innovation, by encouraging companies to stick to tried-and-tested apps, rather than experimenting with new apps that might lead to performance issues.
There’s a tendency to blame sluggish performance on storage, which suggests a simple solution: throwing money at the problem by buying faster stuff. But HPE’s research has shown that 54% of app-data issues are not directly caused by storage. The issues could be a result of any number of interlocking problems, from best-practice violations to configuration errors to interoperability snags. Maybe multipathing isn’t set up correctly. Maybe you’ve got under-provisioned hosts or incorrect virtual network configurations.
The point is that the cause of an app-data gap can be tough to diagnose, let alone fix. This doesn’t mean support is incompetent; the complexity of app-data issues means that they’re inherently difficult. What it means is that there’s got to be a way to leverage a deeper understanding of the whole system.
Struggling For Support
In the absence of that, triage can take a tremendous amount of time. According to the Enterprise Strategy Group, 50% of enterprises using a traditional data center wait one to four days for serious data issues to be diagnosed and resolved. Moreover, that’s after vendors have been contacted, and figuring out which vendor is responsible isn’t always easy. Is it hardware? Software? Connectivity? This is why 65% of those firms have dedicated storage teams, who spend their time troubleshooting data center issues as well as dealing with vendor support. This leads to an average of $240,401 in OpEx, just for storage. There has to be a better way.
There are a few ways forward. One way is to buy more hardware, establish more protocols, train more staff, and hope for the best. But given the budget outlays involved in that, and the limitations created by human error, it’s not necessarily the most efficient route.
Another idea presents itself: using artificial intelligence (AI) to enhance the data center. Machine learning offers several advantages here. It can mine huge amounts of historical data to attain deep knowledge of data center pain points, and create detailed metrics for what causes storage issues. That means solving some problems before they start, and making it easier to determine the nature of others when they do occur. With AI and predictive analytics, data centers can rely less on support and free those teams to focus on the problems where they’re really required.
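To make the idea of mining historical data a little more concrete, here is a minimal sketch in Python. It is not any vendor’s actual method and it is far simpler than real machine learning: it just compares each new latency reading against a short window of recent history and flags readings that deviate sharply. The function name, the sample latency figures, the window size, and the threshold are all assumptions made up for illustration.

```python
from statistics import mean, stdev

def flag_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate strongly from recent history.

    Hypothetical helper: each sample is compared against the mean and
    standard deviation of the `window` samples that precede it.
    """
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as a deviation.
            if samples[i] != mu:
                flagged.append(i)
            continue
        # Flag the sample if it sits more than `threshold` deviations from the mean.
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Made-up storage latency readings in milliseconds, with one obvious spike.
latency_ms = [2.1, 2.0, 2.2, 2.1, 2.0, 2.1, 2.2, 9.5, 2.1, 2.0]
print(flag_anomalies(latency_ms))  # prints [7]
```

A real predictive system would learn from far more signals than one latency series, including configuration, host provisioning, and network telemetry, but the principle is the same: let the historical data define what normal looks like, so that departures from it surface before someone has to open a support ticket.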