After more than a decade of running virtually everything in containers, Google unveiled its long-rumored internal container-oriented cluster-management system in 2015: Borg (you can read about it here).
All Star Trek fans will associate the name with the collective of drones from the series and, more precisely, with their greeting:
« We are the Borg. Lower your shields and surrender your ships. We will add your biological and technological distinctiveness to our own. Your culture will adapt to service us. Resistance is futile. »
« Resistance is futile » seems fitting when we consider how containers have taken over the IT world. I like to think it was Google’s way of relating to the Borg collective (lots of independent, interconnected units with specific tasks working toward a common goal). Following in the footsteps of the maritime trend started by Docker, Kubernetes was born.
Kubernetes (Greek for « helmsman » or « pilot ») is the next-gen Borg, developed in the open for the past three years. It reached version 1.9.6 just a few days ago. It has become an immensely popular open-source system that provides a container-centric management environment to orchestrate computing, networking, and storage infrastructure. In the last year it rose to become a de facto standard, with just about every big name in tech joining the Cloud Native Computing Foundation (CNCF, itself part of the larger Linux Foundation). If you do not know the CNCF, it was born from a partnership between Google and the Linux Foundation around the first public release of Kubernetes. Many great projects have been integrated since.
Initial release: 7 June 2014
1.0 release: 21 July 2015
Every major cloud provider offers a managed Kubernetes service: Google Kubernetes Engine, Amazon Elastic Container Service for Kubernetes, Azure Container Service. I will stop there, but I could add specific offerings from Red Hat, VMware, IBM, CoreOS, etc.
A little bit of history:
As hinted earlier, container tech is not that new. Many use or have used it without even knowing: ever used a Google service? If you are in the IT business and just getting started, you might be surprised by the timeline: here is a quick rundown.
The chroot system call was introduced in Version 7 Unix in 1979 and added to BSD a few years later. The next baby steps towards containers came in 2000, when FreeBSD introduced Jails; similar concepts reached the Linux kernel a year later with Linux-VServer. In 2004, Solaris showed its container implementation: Solaris Zones. In 2006, Google introduced Process Containers, which landed in the Linux kernel two years later as what we now know as Control Groups (cgroups). Fast forward to 2013: Docker made containers accessible and easy via an elegant set of tools, and container popularity exploded. In 2014, Docker Swarm, Kubernetes, and others tried to tackle the orchestration of containers. In 2016, Docker Swarm was integrated into the Docker Engine, making it a breeze to create a cluster and orchestrate containers. In 2017, Docker announced the integration of Kubernetes into Docker Enterprise, embracing the choice made by the community. In 2018, the craze about containers is long gone; the existing tooling is being improved and hardened, and other projects are getting into the spotlight: Istio, Rook, …
An app is … a few components that deliver a service. Traditionally, it was deployed on a host that had the necessary tooling and libraries for that app. VMs allowed a more efficient use of physical hosts. Containers allowed a more efficient use of VMs.
Complexity rises when you want to deploy containers to different hosts: a ton of questions will go through your head once you start containerizing an app and deploying it to a container cluster. How do you select the host? Should you even select a host? With what criteria? How are containers supposed to communicate across different VMs? What happens when a container crashes? When a VM crashes? What if I want to scale a component of my app to two or more instances? How can I expose and hide services? How will I reach services from inside the cluster, or from outside of it? How can a container access its data when containers can be spawned anywhere? …
CoreOS initially proposed fleet (a systemd-based orchestrator) before moving to Kubernetes. Docker provided Compose to orchestrate container deployments locally before extending orchestration to the cluster with Swarm (based on Compose). Rancher used Cattle, its own orchestrator based on Compose. Mesos can manage containerized and non-containerized workloads but is very complex to operate; to counter that complexity, Mesosphere supplied Marathon. In all of these, the container was the smallest deployable unit.
All of them made opinionated choices, yet offered many different ways to do essentially the same things: schedule, control, scale, heal, and expose services.
Enter Kubernetes, bringing Google’s experience from running containers for a decade, plus a community striving to find a better way to orchestrate. The smallest deployable unit is the Pod (one or more containers), because it makes sense to group elements logically (counter-intuitive, isn’t it?). A scheduler places Pods on the cluster based on their resource requirements and other constraints; placement, like pretty much everything else, is driven by labels (metadata). Services expose endpoints in the cluster. Controllers take care of scaling and self-healing. Deployments can be rolled back. Kubernetes can store app configurations and secrets and make them available cluster-wide. Batch jobs can easily be run on the cluster. All cluster components are highly available in production conditions, as can be the services you deploy on said cluster.
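To make those concepts a little more concrete, here is a minimal sketch of a Deployment and a Service manifest. The names (hello-web) and the nginx image are placeholders of my own, not from any particular app; the point is to show labels, resource requests, replicas, and label-based Service selection working together:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web            # hypothetical app name
  labels:
    app: hello-web
spec:
  replicas: 3                # the controller keeps 3 Pods running (scaling, self-healing)
  selector:
    matchLabels:
      app: hello-web         # Pods are matched by labels, not by name
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.13    # placeholder image
        resources:
          requests:          # the scheduler uses these requirements for placement
            cpu: 100m
            memory: 64Mi
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-web
spec:
  selector:
    app: hello-web           # the Service exposes whatever Pods carry this label
  ports:
  - port: 80
```

Apply it with kubectl apply -f, scale by changing replicas, and roll back a bad update with kubectl rollout undo deployment/hello-web.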
A lot of ideas and approaches, simple or complex, good or bad, led to a system that resonated with the community and, more importantly, with the business.
Kubernetes is everywhere. It is slowly drifting towards “boring” tech now that the orchestrator wars are settled, and I mean that in the best way: “boring” here means people are now building yet more abstraction layers on top of containers and orchestration. The hot topics are security, monitoring, tracing, serverless, … What we have right now is so powerful that trying it means there is no turning back. But there is more to come on many fronts, and I cannot wait to see how it is all going to evolve.
If you are contemplating a move to containers, you should make it: even your legacy apps would benefit from being containerized, gaining immediate portability to wherever containers can run, which is pretty much anywhere. Simple? Maybe not. Worth it? Definitely!
If you go down the container road, orchestration will be your very next stop. There are many vendors, platforms, and clouds to choose from, and a lot of them went with Kubernetes as their orchestrator. So if you wish to invest your time efficiently and wisely, get to know Kubernetes.