The Business Case for Kubernetes | stack.io

Kubernetes is (still) the hip new thing: developers love it, DevOps types rant about how great it is, and everyone wants it on their resume. Sometimes it’s hard to separate out what it actually does from all the hype. Once you get past all the buzzwords, is there any tangible financial benefit for companies that use it? Is there a business case for switching over?

What is “Kubernetes” exactly?

At its most basic level, Kubernetes is a scheduler. If you have a cluster of servers, Kubernetes will schedule jobs and programs that need to be run on these servers the way a host/hostess might seat guests at a restaurant. They try to fill tables as efficiently as possible, and ensure that the individual servers get an equal workload. Big groups of guests get the bigger tables they need, and hopefully everyone who wants to go to the restaurant is able to get a seat.

Kubernetes works the same way: each program that you want to run declares what it needs to run, and based on this information, Kubernetes runs the program on one of your servers for you. Of course, there’s a lot more to running a business than that: how do requests from the internet make it to your programs? How do you monitor what your programs are doing? Where is your data stored? These are complex questions, and the entire ecosystem of tools surrounding Kubernetes is sometimes referred to as “Kubernetes” itself. To be clear, when we talk about “Kubernetes” in this article, we are talking about Kubernetes itself plus the ecosystem of tools surrounding it.
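To make the restaurant analogy concrete, here is a minimal sketch in Python of what a scheduler does. It is a toy, not Kubernetes’ real algorithm (which filters and scores servers using many more signals); the Server class, the schedule function, and all of the names and numbers are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    cpu_free: int        # CPU still available, in millicores (1000 = one core)
    memory_free: int     # memory still available, in MiB
    programs: list = field(default_factory=list)


def schedule(program, cpu, memory, servers):
    """Place a program on the roomiest server that can fit it.

    Returns the chosen server's name, or None if nothing has room
    (the program waits, like a guest waiting for a table).
    """
    # Only consider servers that can satisfy what the program declared.
    candidates = [s for s in servers if s.cpu_free >= cpu and s.memory_free >= memory]
    if not candidates:
        return None
    # Toy rule: pick the candidate with the most free CPU to spread the load.
    best = max(candidates, key=lambda s: s.cpu_free)
    best.cpu_free -= cpu
    best.memory_free -= memory
    best.programs.append(program)
    return best.name


# Hypothetical two-server cluster.
servers = [
    Server("server-a", cpu_free=2000, memory_free=4096),
    Server("server-b", cpu_free=4000, memory_free=8192),
]

placements = [
    schedule("web", cpu=1000, memory=1024, servers=servers),
    schedule("api", cpu=1500, memory=1024, servers=servers),
    schedule("worker", cpu=2000, memory=2048, servers=servers),
    schedule("batch", cpu=3000, memory=2048, servers=servers),
]
print(placements)  # ['server-b', 'server-b', 'server-a', None]
```

The last program gets no seat because no single server has three free cores left. In a real cluster it would wait until capacity frees up or a new server is added.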

Of course, even if you’re already convinced of its technical advantages, do these benefits outweigh the cost of a migration to Kubernetes? Switching to Kubernetes costs money. There is no way around this – your staff need to spend time rewriting things and learning the new ecosystem. Their salaries cost money. Hiring a third party to speed the migration up also costs money. So what’s the business case for switching over?

The cost of failure

Computers break all the time (the author of this article wouldn’t have a job if this weren’t the case!). Sometimes a hard drive fails, making the data on it unrecoverable. Whether through human error or hardware failure, servers will go down, often never to come back online again. In rare cases, a critical piece of networking infrastructure might get eaten by the elusive “North American Fiber-Seeking Backhoe”, knocking an entire datacentre offline. Regardless of the cause, these outages cost money. Losing customer data or missing out on revenue while your site is offline is never cheap. This is the cost of failure.

Kubernetes is built with hardware failure in mind. Individual programs, servers, and even datacentres are expendable. If a program crashes for any reason, Kubernetes will start another copy of it automatically. If a server in a Kubernetes cluster becomes unreachable, Kubernetes will reschedule the programs running on it to another, “healthy” server. If a datacentre goes down (and your cluster is spread across more than one), Kubernetes will treat it as if multiple servers are down – your workload will be moved to servers that are still reachable.
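The self-healing behaviour described above boils down to a loop that keeps comparing what you asked for with what is actually running. The Python sketch below shows one pass of such a loop under heavy simplification; the reconcile function, the copy names, and the server names are hypothetical, and real Kubernetes controllers handle far more cases than this.

```python
def reconcile(desired, running, healthy_servers):
    """One pass of a simplified control loop.

    desired:          how many copies of the program should be running
    running:          dict mapping copy name -> server it was running on
    healthy_servers:  list of servers that are currently reachable
    Returns the new mapping of copy name -> server.
    """
    # Copies on unreachable servers are considered lost.
    survivors = {c: s for c, s in running.items() if s in healthy_servers}

    # Count how many copies each healthy server already hosts.
    load = {s: 0 for s in healthy_servers}
    for server in survivors.values():
        load[server] += 1

    # Start replacements on the least-loaded healthy server until
    # the desired number of copies is running again.
    next_id = 0
    while len(survivors) < desired and healthy_servers:
        while f"copy-{next_id}" in survivors:
            next_id += 1
        target = min(healthy_servers, key=lambda s: load[s])
        survivors[f"copy-{next_id}"] = target
        load[target] += 1
    return survivors


# Three copies on three servers; then server-b becomes unreachable.
before = {"copy-0": "server-a", "copy-1": "server-b", "copy-2": "server-c"}
after = reconcile(desired=3, running=before, healthy_servers=["server-a", "server-c"])
print(after)
```

After the pass, the copy that lived on server-b has been restarted on one of the two remaining servers, with no human involved.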

Of course, some things like databases aren’t expendable. Those can have terabytes of irreplaceable data on them – you can’t just spin another one up in seconds if one breaks. Although some people do run databases on Kubernetes, we recommend having stand-by database replicas in another datacentre in case your primary database breaks. This process is even better if failover to the replica is automatic. We recommend using managed database services that, as the name suggests, manage this process for you: having a secondary database replica in a different datacentre with automatic failover is as simple as checking a box in the AWS dashboard. Likewise, using multiple load balancers in multiple datacentres means that your networking gear stops being a single point of failure – if one goes down, the others keep serving traffic.

By using Kubernetes to host your applications and pairing it with failure-tolerant datastores and networking gear, you can ensure that if something breaks, it will often fix itself. The self-healing nature of Kubernetes clusters means that you’ll have fewer outages, and those that do happen will be shorter. Generally speaking, switching to Kubernetes will help you lose less money when things go wrong.

This failure tolerance can also save you money. Some cloud providers like AWS offer cheaper but less reliable instance types, such as “spot instances”. Spot instances are servers that no one else is using, so AWS gives them to you at a bargain rate until someone else needs them. Once someone else wants them, AWS gives you two minutes of warning before it terminates your workload and gives the server to someone else. Though it would be insanity to use spot instances for traditional workloads, Kubernetes lets you safely use these types of servers without service interruption when one gets taken away from you. This added flexibility lets you save money by using these cheaper servers.

“But does it scale?”

This is always a fun question to ask, and also just happens to be number six on this list of ten tricks to appear smart during meetings. The infamous “does it scale?” question is worth addressing here because Kubernetes makes scaling your business effortless.

Adding a server to your Kubernetes cluster increases the amount of resources available to your applications. Scaling up is a one-step process: just create a new server and it will get used. Likewise, you can drain and turn off servers when you don’t need them (and the programs will be moved off safely without causing an outage). In fact, this process is easy to automate: if you need more servers to handle increased load, Kubernetes can automatically provision new ones, and destroy them when they’re no longer needed. You can do this with applications too – Kubernetes can monitor how much CPU your programs are using and spawn more copies if things get busy. Need to manually scale an application quickly? It’s just a single command:

kubectl scale deployment/software-name --replicas=number-of-copies-you-want
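For the automatic, CPU-based scaling mentioned above, the Kubernetes feature involved is the Horizontal Pod Autoscaler, and its documentation describes the core calculation as: desired copies = ceil(current copies × current metric ÷ target metric). The Python sketch below reproduces only that formula plus minimum and maximum limits. It leaves out the tolerance band and stabilization behaviour that the real autoscaler applies, and the function name and example numbers are ours.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=1, max_replicas=10):
    """Simplified version of the Horizontal Pod Autoscaler calculation."""
    # Core formula: scale the copy count by how far usage is from the target.
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # Never go below the configured minimum or above the configured maximum.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical examples with a 60% CPU target:
busy = desired_replicas(4, 90, 60)     # usage above target -> scale up
quiet = desired_replicas(4, 20, 60)    # usage below target -> scale down
capped = desired_replicas(8, 150, 60)  # would want 20, limited by max_replicas
print(busy, quiet, capped)  # 6 2 10
```

Four copies running at 90% CPU against a 60% target become six; the same four copies idling at 20% shrink to two.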

Using Kubernetes with automatic scaling means that your infrastructure costs change from a fixed rate to paying only for what you use. This often comes with substantial savings. You no longer need to have twenty servers running “just in case” you get that big burst of traffic: you pay for just the servers you actually need, and when a traffic spike hits, you’ll automatically have as much capacity as you need to meet demand. In addition to the flat-out cost savings from only paying for what you use, this also means that as your business grows, your infrastructure grows with it (without needing someone to “scale it up” manually). Alternatively, in the entirely hypothetical scenario that there’s a global pandemic and business dries up, your infrastructure will automatically scale itself down if it’s no longer needed. You can even save money by having your development environments automatically turn themselves off over the weekend if no one’s using them.
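A quick back-of-the-envelope calculation shows where the savings come from. Everything in this Python sketch is invented for illustration: the price per server-hour, the daily traffic shape, and the variable names. It also ignores the spare headroom and start-up time a real autoscaled cluster needs, so treat the result as an upper bound rather than a forecast.

```python
PRICE_CENTS_PER_SERVER_HOUR = 10   # hypothetical price, in cents
PEAK_SERVERS = 20                  # the "just in case" fleet size

# Hypothetical number of servers actually needed during each hour of one day:
# a quiet night, a normal morning, a two-hour spike, then tapering off.
servers_needed = [4] * 8 + [8] * 4 + [20] * 2 + [8] * 6 + [4] * 4

# Fixed fleet: pay for peak capacity around the clock.
fixed_cost_cents = PEAK_SERVERS * len(servers_needed) * PRICE_CENTS_PER_SERVER_HOUR

# Autoscaled fleet: pay only for the servers in use each hour.
autoscaled_cost_cents = sum(servers_needed) * PRICE_CENTS_PER_SERVER_HOUR

savings_percent = 100 * (fixed_cost_cents - autoscaled_cost_cents) / fixed_cost_cents

print(f"Fixed fleet:      ${fixed_cost_cents / 100:.2f} per day")
print(f"Autoscaled fleet: ${autoscaled_cost_cents / 100:.2f} per day")
print(f"Savings:          {savings_percent:.0f}%")
```

With these made-up numbers the fixed fleet costs $48.00 a day and the autoscaled one $16.80. The spikier your traffic, the bigger the gap.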

Switching to Kubernetes means that you’ll only pay for what you need, and as your business grows or shrinks, your infrastructure will scale with it gracefully and effortlessly.

A more productive development team

Deploying a new version of your software can take a long time, and probably involves multiple developers and system administrators, as well as technical and managerial approvals. In most cases, deploying a new version incurs downtime or extra pay for staff while they update things during maintenance windows in the middle of the night. Kubernetes makes deployments effortless and free of downtime. Though every business has different needs, many deployment workflows can be replaced with a single command:

kubectl set image deployment/software-name software-name=software-name:version

This performs a graceful deployment of a new software version without any downtime. The simplicity and automation of deploying new software versions to Kubernetes means that your developers and system administrators no longer need to spend hours deploying software at convenient times for your business – you can now deploy as often as you want without downtime.

Normal software development workflows also have a cost: in an age of microservices, you often need to buy developers expensive workstations so that they can reproduce your infrastructure on their laptops. This doesn’t even address the significant amount of time it takes to set up these development environments and troubleshoot issues that crop up because your developers aren’t working in the same environment as your production workloads. Fortunately, there are tools that help with this: for instance, Microsoft’s “Bridge to Kubernetes” tool lets your developers run and debug programs directly on a Kubernetes cluster without impacting other developers. This means you’ll see fewer bugs cropping up due to differences in your development and production environments and makes the development workflow significantly faster and easier. There’s no need to spend time troubleshooting a local development setup if you can reuse the cloud setup that already works.

If your developers suck, they will continue to suck. However, moving to Kubernetes means that they will suck harder, faster, and possibly more creatively than ever before. Time is money, and Kubernetes’ excellent development tooling means that your developers will have more time to focus on actually developing the software that runs your business.

Better cost tracking and accountability

Sometimes saving money isn’t the only goal. Often your goal might just be to have a better view into your infrastructure spend. What parts of your business cost the most money? Are there any services that are wasting money?

Though Kubernetes itself doesn’t help with cost tracking and auditing, we like using the open-source tool Kubecost to monitor how much each piece of infrastructure costs. Each piece of software on Kubernetes gets automatically assigned a set of labels and can be manually tagged as well. Kubecost breaks down how much each piece of your infrastructure costs based on these labels. This lets you track how much each part of your business costs and lets you hold your tech team accountable for what is spent on infrastructure each month. Even better, Kubecost also provides automatic cost-savings suggestions that you can choose to apply to save money. We won’t lie to you and say these suggestions should be followed without question (AI still has a long way to go…), but often these automated recommendations are a great starting point for cost-optimization discussions with your tech team. Tools like Kubecost ensure that all parts of the business have insight into what money is getting spent where, and why.
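The idea behind a label-based cost breakdown is simple enough to sketch in a few lines of Python. To be clear, this is not Kubecost’s API or its data format; the workload names, labels, dollar amounts, and the cost_by_label function are all hypothetical and exist only to show the grouping step.

```python
from collections import defaultdict

# Hypothetical workloads with their labels and monthly cost in dollars.
workloads = [
    {"name": "checkout-api", "labels": {"team": "payments", "env": "prod"}, "monthly_cost": 1200},
    {"name": "fraud-check", "labels": {"team": "payments", "env": "prod"}, "monthly_cost": 800},
    {"name": "search-api", "labels": {"team": "search", "env": "prod"}, "monthly_cost": 1500},
    {"name": "search-api-staging", "labels": {"team": "search", "env": "staging"}, "monthly_cost": 300},
    {"name": "legacy-cron", "labels": {"env": "prod"}, "monthly_cost": 200},  # nobody tagged a team
]


def cost_by_label(workloads, label_key):
    """Sum monthly cost per value of one label (for example per team)."""
    totals = defaultdict(int)
    for workload in workloads:
        # Anything missing the label is grouped so it stays visible.
        value = workload["labels"].get(label_key, "unlabelled")
        totals[value] += workload["monthly_cost"]
    return dict(totals)


by_team = cost_by_label(workloads, "team")
by_env = cost_by_label(workloads, "env")
print(by_team)  # {'payments': 2000, 'search': 1800, 'unlabelled': 200}
print(by_env)   # {'prod': 3700, 'staging': 300}
```

The “unlabelled” bucket is often the most useful line in the report: it is spend that nobody currently owns.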

Summing things up

Switching to a new tech stack costs money, and Kubernetes is no exception. However, the improvement in tooling may end up saving your business money. By using Kubernetes, you can ensure that:

You’ll have fewer outages due to hardware failure, and those that do happen will often fix themselves without human intervention.
Kubernetes lets you save money on infrastructure by allowing you to safely use cheaper but less reliable server types like AWS Spot Instances.
Kubernetes lets you save money on infrastructure by only paying for what you need. As your business grows (or shrinks), your infrastructure will resize with it.
Your developers can spend more time developing. A lot of tasks that used to be done manually like software deployments become trivial to automate on Kubernetes. You’ll also see fewer bugs due to differences between developer workstations and your production environment – Kubernetes-friendly developer tooling lets your developers work in the same environment that your software gets deployed in.
Kubernetes has excellent cost-tracking tooling that lets you monitor how much each part of your business costs. Instead of needing to always take someone else’s word for how much things cost, there are user-friendly tools like Kubecost that let you directly track and audit your infrastructure spend.

Is it worth making the switch? We’ll let you decide - but if you do decide to migrate, Stack.io can help you save time and money by doing things right the first time, and you’d get every advantage mentioned here from the very start.
