The more you’re willing to self-manage your cloud software, the lower your costs will be. These days, it’s actually relatively easy to use tools like Docker and Kubernetes to deploy stable infrastructure. I even use Docker and Kubernetes at home on “spellbook” Raspberry Pis for home-automation purposes. In this post, I’ll show that the same basic tools can be used to manage an AWS Kubernetes deployment extremely cheaply.
This blog is, in fact, served by the AWS-hosted Kubernetes cluster that is my “production environment.” It serves as the case study for this post. It uses my open-source Switchboard projects to route traffic for many WordPress blogs and gRPC services at once.
But I’m getting ahead of myself. For the context of this post, let’s assume the objective is to launch something like MySQL and a WordPress container.
Want to know the exact price tag? Skip to the end of the article.
What to Buy
I like to think of the following as “commodities:”
- Computing (CPU + Memory)
- Data (Storage + Transfer)
A good rule with commodities is:
The more “raw” of a commodity you buy, the cheaper the bill will be.
Amazon will gladly let you purchase load balancers (ELB), distributed databases (RDS), and other “managed solutions.” In each of these cases, you’re paying Amazon to provide software on top of the basic computing/data commodities. Instead, I chose to use my open-source Switchboard Docker container (based on Envoy) for load balancing, as well as a MySQL StatefulSet. Because they run in the cluster itself, costs are more than 60% lower than paying for ELBs and RDS.
There is a time when paying for managed services makes sense. Large-scale companies may not wish to hire database specialists or manage their own real-time data infrastructure. Part of my job when I was an engineering manager was to consider the “developer hours” saved by spending money on a managed solution.
However… the open-source tooling used by system administrators is becoming easier to use by the day. I would not have considered managing a “production grade” server for my side projects before Kubernetes came along. But with tools like kops, it’s really not hard to get something up and running in a way that’s both stable and cheap.
Which AWS Region?
Before setting up a cluster, you’ll need to select the region. I settled on
- It is cheaper than the west-coast servers.
- It’s not first to receive updates, like
The second point is somewhat anecdotal. I’ve heard sysadmins point out incidents caused by us-east-1 being the first to receive patches (and, with them, bugs). I would expect that it would also receive the first security patches. ¯\_(ツ)_/¯
AWS Kubernetes: EKS/Fargate vs. kops
AWS provides a managed solution called Elastic Kubernetes Service (EKS), as well as its own container management system called Fargate. I found Fargate to be expensive and clunky. EKS does “just work,” but I frankly found it easier to use the same tools I use elsewhere (i.e., kops).
There is probably a scale at which this becomes more complicated than it’s worth. But at least at the small scale, configuring the master and worker nodes is not challenging. It’s easy to allocate exactly the resources (compute and storage) a well-defined cluster needs, so little is wasted. And in my experience, auto-scaling works well enough that I could expect to receive many, many orders of magnitude more traffic to this blog before I became concerned about stability. I’d expect the single MySQL StatefulSet instance to fail first, at which point paying for Aurora might make more sense than becoming an expert database administrator.
I’m going to assume you’ve already deployed to AWS with kops.
kops edit ig nodes will let you configure the nodes in your cluster, where your containers actually run. Under spec, there are several values which might be tweaked:
machineType: I eventually settled on t3a.medium. The key difference between the t3 series is burst capacity. I found the latter more suited for my variable needs. A t3a.small may suffice for single-website deployments with few bells and whistles (Tiller, observability tooling, etc.).
maxSize: how many EC2 instances? You can set both this and minSize to 1 if you’re looking for the absolute minimum cost, but this should be traded off against scaling concerns and downtime during cluster maintenance.
rootVolumeSize: how much storage (in GB)? 100 was enough for me.
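As a rough illustration, the node settings above can be written down and sanity-checked before typing them into the editor. This is only a sketch: it assumes the storage figure maps to the kops rootVolumeSize field (in GB), it picks minSize and maxSize of 1 purely as the minimum-cost example from the text, and the helper names validate and to_yaml_fragment are hypothetical, not part of kops.

```python
# Sketch of the node instance-group values discussed above.
# Field names are kops InstanceGroup spec fields; the helpers are hypothetical.
node_spec = {
    "machineType": "t3a.medium",
    "minSize": 1,           # assumption: the minimum-cost setting from the text
    "maxSize": 1,           # assumption: the minimum-cost setting from the text
    "rootVolumeSize": 100,  # assumption: the "100" in the text is a size in GB
}


def validate(spec):
    """Return a list of problems found in an instance-group spec dict."""
    problems = []
    if spec["minSize"] < 0:
        problems.append("minSize must not be negative")
    if spec["maxSize"] < spec["minSize"]:
        problems.append("maxSize must be >= minSize")
    if spec["rootVolumeSize"] <= 0:
        problems.append("rootVolumeSize must be positive")
    return problems


def to_yaml_fragment(spec):
    """Render the dict as indented key: value lines, as they appear under spec."""
    return "\n".join(f"  {key}: {value}" for key, value in spec.items())


if not validate(node_spec):
    print(to_yaml_fragment(node_spec))
```

Raising maxSize above minSize gives the autoscaler headroom, at the price of a less predictable bill.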
It’s also possible to configure the master nodes with kops edit ig [cluster-name].
machineType: I used t3a.small, but found that I had to keep my pod count in check or else deploys would get stuck. The burst capacity of the t3a series is pretty useful in this case, as master nodes tend to experience highly variable load. I tried t3a.micro and while it seemed to technically “work,” it was a painful experience in practice.
minSize: can be set to 1.
10 worked for me and has not been filled after the better part of a year.
If you know you’ll be running your cluster for at least a year, reserved instances are an obvious choice. They let you lock in a lower price by committing to a one- or three-year term. These days it’s very easy to optimize costs in this way. Simply run a cluster for 7+ days, and then use the “recommendations” tab in the AWS Cost Explorer.
In my case, I was able to shave off about 32% for 1-year contracts and 44% for 3-year contracts.
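To make those percentages concrete, here is a small sketch that applies them to a daily bill. The $1.00/day on-demand figure is a placeholder chosen for easy arithmetic, not a number from my account, and discounted_cost is a hypothetical helper.

```python
def discounted_cost(on_demand_cost, discount):
    """Cost after applying a fractional discount (0.32 means 32% off)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in the range [0, 1)")
    return on_demand_cost * (1 - discount)


# Placeholder on-demand bill; substitute your own Cost Explorer figure.
HYPOTHETICAL_ON_DEMAND_PER_DAY = 1.00

# Discounts quoted in the text for 1-year and 3-year commitments.
for label, discount in [("1-year", 0.32), ("3-year", 0.44)]:
    cost = discounted_cost(HYPOTHETICAL_ON_DEMAND_PER_DAY, discount)
    print(f"{label}: ${cost:.2f}/day")
```

The discount only applies to the instance hours covered by the reservation, so storage and data transfer on the same bill would not shrink by the same fraction.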
The graph above showed a deployment running for $0.83/day. My current setup (the server serving you this blog right now) tends to hover under $1.50/day.
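For a sense of scale, the two daily figures above can be annualized. This sketch assumes a 365-day year and a constant daily cost, which a real bill will only approximate; yearly_cost is a hypothetical helper.

```python
from decimal import Decimal

DAYS_PER_YEAR = 365  # assumption: ignores leap years


def yearly_cost(daily_cost):
    """Annualize a daily cost given as a string such as "0.83"."""
    return Decimal(daily_cost) * DAYS_PER_YEAR


# The two daily costs quoted in the text.
for daily in ("0.83", "1.50"):
    print(f"${daily}/day is about ${yearly_cost(daily):.2f}/year")
```

Decimal is used instead of float so the dollar amounts come out exact to the cent.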
What’d I miss? What can be cheaper?