The Limitations of Docker, The Problems Kubernetes Solves, and Their Approaches
Docker has been a game-changer for running applications in containers: developers can package an app once and run it anywhere, solving the classic problem of software that works on one machine but not another. However, as containers grew popular, Docker faced challenges with managing them at scale, particularly around scheduling, networking, and storage. Docker provided some solutions, but they weren't comprehensive.
This is where Kubernetes comes in. Kubernetes excels at managing containers across multiple machines, especially for scheduling, networking, and storage. Unlike Docker, which tries to solve problems within its own platform, Kubernetes is designed to work across different cloud environments. It uses standards like the Container Network Interface (CNI) for networking, Container Runtime Interface (CRI) for running containers, and Container Storage Interface (CSI) for storage.
Automation is crucial in managing containers, especially for large systems. Manual management is impossible at the scale companies like Google operate, serving billions of users. Kubernetes, an open-source tool, helps automate container management, making it easier to handle failures and scalability.
The Lack of a Scheduler and Orchestrator in Docker
Docker was initially designed around a single host, making it great for running containers on one computer. However, in large production environments with thousands of containers across many virtual machines (VMs), Docker struggled. There was no built-in way to decide which VM should run each container, keep containers healthy, or manage the whole Docker cluster. This required a separate "manager" with an API to interact with the Docker daemons on different machines.
Kubernetes addresses these issues by providing advanced orchestration. It efficiently allocates containers to VMs based on available resources and workload needs. Kubernetes also monitors container health and recovers them automatically, ensuring high availability. The Kubernetes control plane centralizes management, simplifying interactions with the cluster, no matter how many machines are involved.
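As a concrete sketch of this orchestration, a minimal Deployment manifest (names and image are placeholders) asks Kubernetes to keep three replicas running and declares the resources the scheduler uses when picking a node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                # Kubernetes keeps 3 copies running at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          resources:
            requests:        # the scheduler places this pod on a node with this much free capacity
              cpu: "100m"
              memory: 128Mi
```

If a container or its node dies, the control plane notices that fewer than three replicas are running and starts a replacement on another node, with no manual intervention.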
Problems with Networking in Docker
Imagine running a Flask application in one container and a MongoDB database in another. If the Flask app gets too much traffic, you might need to run more Flask containers. But how do users connect to these multiple Flask containers?
A load balancer is the solution, but Docker doesn’t have a built-in one. This makes it hard to scale Flask containers and ensure reliability. Also, Docker’s proxy for forwarding traffic to containers doesn’t automatically update when new replicas are added or if one goes down.
To fix this, external solutions like Traefik proxy are used with Docker to manage multiple containers and handle load balancing. However, installing and managing Traefik adds complexity.
Kubernetes solves these problems by providing built-in solutions for load balancing, container communication, and service discovery. Kubernetes can distribute traffic across multiple containers, making scaling easier. It uses cross-host networks and the CNI specification for communication between containers on different VMs. Kubernetes also offers DNS-based service discovery, allowing services to be found by name, just like websites.
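Returning to the earlier Flask example, a sketch of a Service manifest (names and ports are hypothetical) that load-balances traffic across all Flask pods and gives them one stable DNS name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: flask-app
spec:
  selector:
    app: flask          # matches the label on every Flask pod, however many replicas exist
  ports:
    - port: 80          # the port clients connect to
      targetPort: 5000  # the port the Flask container listens on
```

Inside the cluster, other pods can now reach the app at http://flask-app (or flask-app.&lt;namespace&gt;.svc.cluster.local), and Kubernetes spreads requests across the replicas. Adding or removing replicas requires no proxy reconfiguration, which is exactly what the external Traefik setup had to be managed for under plain Docker.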
Problems with Storage Management in Docker
When running a database or other stateful component in Docker, it stores data in a specific location on the VM. If the VM fails, someone needs to manually fix the problem, which can be time-consuming and prone to errors. Kubernetes automates this process, handling storage failures automatically.
Kubernetes doesn’t have its own storage solution but uses the Container Storage Interface (CSI) specification. This allows developers to create storage solutions that work with Kubernetes, ensuring compatibility across different cloud environments.
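For example, a pod can request storage declaratively through a PersistentVolumeClaim; whichever CSI driver backs the storage class (the class name `standard` here is a placeholder) provisions and attaches the volume:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-data
spec:
  accessModes:
    - ReadWriteOnce            # mountable read-write by a single node at a time
  storageClassName: standard   # maps to whichever CSI driver the cluster administrator installed
  resources:
    requests:
      storage: 10Gi
```

If the pod using this claim is rescheduled to another node, Kubernetes detaches and reattaches the volume automatically, where the underlying storage system supports it, rather than requiring the manual recovery described above.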
The Future of Docker
Kubernetes was initially focused on managing containers in production, but it has grown to support entire build and deployment systems, like CI/CD pipelines. Platforms like OpenShift, based on Kubernetes, include features for build jobs and deployments. Lightweight versions like microk8s and k3s allow developers to run Kubernetes on their laptops.
While Kubernetes doesn’t build container images, tools like buildkit and Kaniko can be used, and even GitLab can be run in Kubernetes. This means organizations can now use Kubernetes for the entire software development process.
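As a sketch of in-cluster image building, a Kaniko pod (the repository and registry URLs are placeholders) can build from a Dockerfile without any Docker daemon on the node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - --dockerfile=Dockerfile
        - --context=git://github.com/example/app.git      # placeholder source repository
        - --destination=registry.example.com/app:latest   # placeholder target registry
```

A real build would also mount registry credentials for the push step; this sketch omits them for brevity.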
Though Docker was once the go-to tool for containers, other solutions are now often better for containerization and deployment. Docker remains popular on developers' desktops and in development environments, especially because Docker Desktop runs on Windows and macOS, whereas Kubernetes nodes run primarily on Linux.
The Evolution of Docker, Kubernetes, containerd, and the Birth of the Container Runtime Interface
Container technology has evolved significantly, largely due to Docker and Kubernetes.
Docker - Build, Ship, and Run Any App, Anywhere
Docker didn’t invent containers, but it made them easy to use. Before Docker, technologies like Unix chroot, FreeBSD jails, Solaris Zones, and Linux Containers (LXC) provided isolation for running processes, but they weren’t user-friendly. Docker, introduced in 2013, simplified container usage, allowing developers to easily build, share, and run containers. It quickly became the standard for container runtimes.
The Birth of Kubernetes and its Initial Dependency on Docker
As Docker usage grew, managing containers at scale became challenging. Google, drawing on its experience with a project called Borg, introduced Kubernetes in 2014. Kubernetes provided a platform to orchestrate and manage containers, and it quickly became popular.
Initially, Kubernetes relied on Docker as the container runtime, but as other runtimes like rkt, CRI-O, and containerd emerged, Kubernetes needed to support them as well.
The Birth of containerd
Docker was more than just a container runtime; it included features for networking, storage, and image building, which could complicate things in a Kubernetes environment. To address this, Docker began separating its components, leading to the creation of containerd, a simpler container runtime.
In 2017, Docker donated containerd to the Cloud Native Computing Foundation (CNCF), making it a key part of the open-source container ecosystem. By April 2017, Docker released a refactored runtime built on containerd.
The Birth of Container Runtime Interface (CRI)
To make Kubernetes independent of any specific container runtime, the Kubernetes community introduced the Container Runtime Interface (CRI) in 2016. CRI allows Kubernetes to use different container runtimes without needing to be recompiled, promoting flexibility and interoperability.
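In practice, the kubelet on each node is pointed at a CRI endpoint, a Unix socket served by the runtime, so swapping runtimes means swapping the socket. A sketch of the relevant field in a KubeletConfiguration (available as a config field in newer Kubernetes versions; the path shown is containerd's default):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
# a CRI-O node would instead use unix:///var/run/crio/crio.sock
```

Because every CRI runtime speaks the same gRPC protocol over such a socket, the kubelet's code never changes; only this endpoint does.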
With CRI in place, Kubernetes began supporting various container runtimes, including containerd, which became fully compatible with CRI by 2018.
The Deprecation and Removal of Docker Support From Kubernetes
In 2020, Kubernetes announced it would deprecate Docker as a runtime in version 1.20, with removal initially planned for version 1.22; the removal ultimately happened in 2022 with version 1.24. This caused some confusion, but Kubernetes was only removing Docker as a runtime (the dockershim component), not as a tool for building images. CRI-compatible runtimes like containerd or CRI-O are now used to run containers in Kubernetes.
Docker’s role in popularizing containers and Kubernetes’ contribution to cloud-native technologies have been foundational to the cloud-native ecosystem’s growth.
To Kubernetes or Not to Kubernetes - The Container Orchestration Wars
When Kubernetes was introduced in 2014, the container orchestration landscape was highly competitive, with solutions like Docker Swarm, AWS ECS, VMware VIC, Mesosphere, and Red Hat OpenShift all vying for dominance.
Docker Swarm provided native clustering and scheduling, but as Kubernetes gained popularity, Docker started supporting it, eventually including Kubernetes support in Docker Desktop by 2018.
AWS initially offered ECS but later introduced Elastic Kubernetes Service (EKS) in 2017, recognizing Kubernetes’ growing popularity.
VMware also shifted towards Kubernetes with its Tanzu portfolio, moving from a VM-based approach to a modern container-based architecture.
Mesosphere transitioned from their own platform, Mesos, to embrace Kubernetes with their Konvoy distribution.
Microsoft launched Azure Kubernetes Service (AKS) to simplify Kubernetes deployment and operations on Azure.
Red Hat’s OpenShift, which initially used a different orchestrator, switched to Kubernetes starting with OpenShift 3, combining Kubernetes’ power with developer and operations tools.
Pivotal Cloud Foundry, a popular PaaS product, also integrated Kubernetes to modernize its offerings, leveraging Kubernetes for container orchestration.
Today, Kubernetes is the de facto standard for container orchestration, unifying the approach to cloud-native technologies across the industry. Its rise from a new player to the leading container orchestration platform highlights the dynamic nature of the tech industry and the influence of the open-source community.