Real-life example: why our insurance client decided to move on-premises
For our customer, a global insurance service provider, we prepare on-premises infrastructure for automated application deployment and multi-cluster system maintenance. Initially, the client used a hybrid model that split workloads roughly equally between the cloud and on-premises data centers. Over time, they concluded that on-premises servers are much more favorable, and not only financially. Here are some reasons worth mentioning.
Infrastructure availability and management
Recall when Microsoft and other big tech names decided to leave China: companies using Microsoft Azure had to urgently look for other cloud service providers. In addition, cloud services may become unavailable because of malfunctions or maintenance work, and that is something you cannot influence. For big companies with high-load applications, such downtime can be critical, which is why they look for alternatives in the form of on-premises data centers. When you have several data centers of your own (or rented racks in third-party data centers), you can switch to spare capacity and thereby avoid downtime.
Of course, no one claims that malfunctions or force majeure are impossible in a data center. It can burn down (as happened several years ago in Europe) or require electrical rewiring or other maintenance work. For such cases, companies usually have several data centers and use the capacity of all of them. If one DC fails, the others keep working in regular mode. Your service will run slower, but it will remain operational, and at least downtime won’t threaten you.
Data security and confidentiality
Companies that handle sensitive information are less willing to rely on third parties and prefer to keep their data close. Insurance service providers store and process large volumes of their customers’ personal and financial data, so even a minimal data leak becomes a disaster and undermines their reputation. Their own data centers let them control how data is stored and processed and minimize its exposure. Therefore, many businesses prefer the security of their own data centers over the convenience of the cloud.
Costs
When comparing on-prem vs cloud, the most critical factor to weigh is the expected cost of each option. The initial procurement of hardware for your own data center may seem to cost a fortune. But let’s consider the example of our insurance customer. The necessary hardware, around 70 servers, cost them approximately USD 500,000. Given the computing capacity they require, the payback period is about five years. The hardware will need partial replacement after seven years and full replacement after ten. So between the payback point and the first partial replacement, they enjoy at least two years of server usage for free.
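The arithmetic behind these figures can be sketched in a few lines of Python. This is a simplified illustration, not the client's actual cost model: it assumes a simple, undiscounted payback period and ignores running costs such as power, staff, and rack rent. The function names are hypothetical.

```python
# Figures taken from the text above.
HARDWARE_COST_USD = 500_000
PAYBACK_YEARS = 5
PARTIAL_REPLACEMENT_YEAR = 7


def implied_annual_savings(hardware_cost: float, payback_years: float) -> float:
    """Yearly savings versus the cloud implied by a simple payback period.

    Assumption: savings are constant per year and not discounted.
    """
    return hardware_cost / payback_years


def free_years(payback_years: int, partial_replacement_year: int) -> int:
    """Years between the payback point and the first hardware replacement."""
    return partial_replacement_year - payback_years


if __name__ == "__main__":
    print(implied_annual_savings(HARDWARE_COST_USD, PAYBACK_YEARS))  # 100000.0
    print(free_years(PAYBACK_YEARS, PARTIAL_REPLACEMENT_YEAR))       # 2
```

Under these assumptions, a five-year payback on USD 500,000 implies savings of roughly USD 100,000 per year compared with the cloud.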
Cloud to on-prem data center transition: main intricacies
For all its virtues, an on-prem data center is not easy to run. If you are weighing on-prem vs cloud, there are several things worth keeping in mind before you start the transition.
Infrastructure maintenance is totally different
When you manage your infrastructure in the cloud, you don’t need a deep understanding of how Kubernetes clusters function: you just need to know which buttons to push to start a cluster, and it works as if by magic. In your own data center, there is no such button. You must understand how clusters work, which programs you can run, and what manages what. Say you use Kubespray, a ready-made solution that starts a cluster through Ansible. You can run it practically blindfolded, and initially it may even seem that everything works just fine. But production will not function properly for long in such a setup: when an issue arises, you won’t know how to fix it, precisely because you have no idea what happens under the hood.
Let’s consider one simple example. Cloud providers normally offer backup services. You can configure backups however you want, manually or automatically, and you don’t need to worry about data retention. The most common mistake, made even by experienced IT specialists who have worked with the cloud but never with on-premises data centers, is storing backups on the same server as the data. It doesn’t take long to see what happens in a force majeure event: the server dies, and the backup dies with it.
To avoid such incidents, when planning a cloud to on-prem transition, allocate approximately six months for training your IT and DevOps specialists. It is not recommended to abandon the cloud all at once. Run workloads on-prem and in the cloud in parallel, so that the staff can learn how to organize the infrastructure in an on-prem data center properly and delve into all the intricacies of managing it correctly.
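A guard against the same-server backup mistake might look like the Python sketch below. It is only an illustration: the function names, host names, and destination formats (URL style and scp/rsync style) are assumptions, not part of any particular backup tool. Note also that a different host name does not guarantee a different physical machine.

```python
from typing import Optional
from urllib.parse import urlparse

# Names that always point back at the machine itself.
LOCAL_NAMES = {"localhost", "127.0.0.1", "::1"}


def backup_host(destination: str) -> Optional[str]:
    """Return the host of a backup destination, or None for a local path."""
    if "://" in destination:
        # URL style, e.g. sftp://backup01.internal/srv (file:///... gives None).
        return urlparse(destination).hostname
    # scp/rsync style, e.g. user@backup01.internal:/srv/backups
    host, sep, path = destination.partition(":")
    if sep and path.startswith("/") and "/" not in host:
        return host.split("@")[-1].lower()
    return None  # plain local path such as /var/backups


def is_offhost_backup(destination: str, source_host: str) -> bool:
    """True only if the backup lands on a host other than the source server."""
    host = backup_host(destination)
    if host is None or host in LOCAL_NAMES:
        return False
    return host != source_host.lower()


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(is_offhost_backup("/var/backups/db", "db01.internal"))               # False
    print(is_offhost_backup("sftp://backup01.internal/srv", "db01.internal"))  # True
```

A check like this could run before every backup job and fail loudly when the destination resolves to the server being backed up.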
Writing Infrastructure as Code (IaC)
When you follow the IaC approach in the cloud, you use Terraform to describe the result you expect, and everything else is done automatically. That trick doesn’t work in a data center. In on-premises infrastructure, we use Ansible instead of Terraform, and there we describe not the result we want but the steps that must be taken to reach it. The approach is the opposite and, as Murphy’s law would have it, takes much longer. Also, if a failure occurs, fixing it is labor- and time-consuming, since you have to do it manually.
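The difference between describing the result and describing the steps can be shown with a toy Python model. This is purely conceptual and does not reflect how Terraform or Ansible are implemented. The server names and function names are made up for illustration.

```python
from typing import List, Set, Tuple

Action = Tuple[str, str]  # (operation, resource name)


def reconcile(current: Set[str], desired: Set[str]) -> List[Action]:
    """Result-oriented style: you state the desired set of resources,
    and the tool derives the actions by comparing it with the current state."""
    actions = [("create", name) for name in sorted(desired - current)]
    actions += [("destroy", name) for name in sorted(current - desired)]
    return actions


def run_steps(current: Set[str], steps: List[Action]) -> Set[str]:
    """Step-oriented style: you write the ordered steps yourself,
    and the tool simply executes them one by one."""
    state = set(current)
    for operation, name in steps:
        if operation == "create":
            state.add(name)
        elif operation == "destroy":
            state.discard(name)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return state


if __name__ == "__main__":
    current = {"web-1", "old-db"}
    desired = {"web-1", "web-2"}
    plan = reconcile(current, desired)
    print(plan)                      # [('create', 'web-2'), ('destroy', 'old-db')]
    print(sorted(run_steps(current, plan)))  # ['web-1', 'web-2']
```

In the first style the plan is computed for you. In the second, writing and maintaining the list of steps is your job, which is where the extra time goes.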
Other non-obvious but common factors that depend on experience
Some seemingly negligible things can cause a few days of downtime. They are not obvious, and many DevOps engineers learn about them the hard way. For example, Kubernetes components authenticate and encrypt their communication with TLS certificates. Like most certificates, the Kubernetes ones are valid for a year and must be renewed before that period ends. What happens if you forget this simple action? The cluster stops responding and you lose control of it. Renewing the certificates then turns into a true adventure that involves multiple inconvenient steps and long downtime if you do something wrong.
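One way to avoid being caught out is to check certificate expiry dates on a schedule. The Python sketch below uses only the standard library and works on the `notAfter` date string of a certificate, in the format printed by `openssl x509 -enddate -noout -in <certificate file>`. The function names and the 30-day threshold are assumptions for illustration. On kubeadm-managed clusters, `kubeadm certs check-expiration` reports similar information.

```python
import ssl
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[datetime] = None) -> float:
    """Days left before a certificate expires.

    not_after uses the certificate date format, e.g. "Jan 31 00:00:00 2026 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, threshold_days: float = 30,
                  now: Optional[datetime] = None) -> bool:
    """True if the certificate has expired or expires within threshold_days."""
    return days_until_expiry(not_after, now) <= threshold_days


if __name__ == "__main__":
    # Fixed reference time so the example is reproducible.
    reference = datetime(2026, 1, 1, tzinfo=timezone.utc)
    print(days_until_expiry("Jan 31 00:00:00 2026 GMT", reference))  # 30.0
    print(needs_renewal("Jun  1 00:00:00 2026 GMT", now=reference))  # False
```

Wiring such a check into monitoring, with an alert well before the threshold, turns a potential outage into a routine maintenance task.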
The transition from the cloud to on-premises data centers is systematic work aimed at the future. If you have a sustainable business, operate large volumes of data, have been on the market for years, and have more or less clear prospects for the future, the transition to on-premises infrastructure may be a reasonable way to reduce costs and handle your data with the utmost security. Yes, cloud infrastructure is much easier to maintain, but there are already tools that make maintaining on-premises infrastructure more convenient. One example is MAAS (Metal as a Service), an open-source tool that enables cloud-style provisioning for physical servers.
But keep in mind when comparing on-prem vs cloud that the transition is not something you can do overnight: even creating a PoC takes a while. Companies making such a decision should be thoroughly prepared for a long transition period and for all the challenges they will inevitably face along the way.