
Kubernetes, CoreOS, and many lines of Python later.

Several months after my last post, and lots of code hacking, I can rebuild a CoreOS-based bare-metal Kubernetes cluster in roughly 20 minutes. It only took ~1300 lines of Python, following Kelsey Hightower’s Kubernetes the Hard Way instructions.

Why? The challenge.

But really, why? I like to hack on code at home, and spinning up a new VM for another Django or Golang app was pretty heavyweight, when all I needed was an easy way to push it out via container. And with various open source projects out on the web providing easy ways to run their code, running my own Kubernetes cluster seemed like a no-brainer.

From github/jforman/virthelper:

First we need a fleet of Kubernetes VMs. This script builds 3 controllers (corea-controller{0,1,2}.domain.obfuscated.net) with static IPs from 10.10.0.125 to .127, and 5 worker nodes (corea-worker{0,1,2,3,4}.domain.obfuscated.net) beginning at 10.10.0.110.

These VMs use CoreOS’s beta channel, each with 2 GB of RAM and 50 GB of disk.
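The addressing scheme above (sequentially numbered hostnames, static IPs counting up from a base address) can be sketched in a few lines of stdlib Python. This helper is illustrative only, not part of vmbuilder.py:

```python
import ipaddress

def plan_cluster(prefix, base_ip, count, domain):
    """Return (fqdn, ip) pairs for a cluster of sequentially numbered VMs."""
    base = ipaddress.IPv4Address(base_ip)
    return [(f"{prefix}{i}.{domain}", str(base + i)) for i in range(count)]

controllers = plan_cluster("corea-controller", "10.10.0.125", 3, "domain.obfuscated.net")
workers = plan_cluster("corea-worker", "10.10.0.110", 5, "domain.obfuscated.net")
```

Passing `--coreos_create_cluster --cluster_size N` to vmbuilder.py produces exactly this fan-out from the single `--host_name` and `--ip_address` given on the command line.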

$ ./vmbuilder.py --debug create_vm --bridge_interface br-vlan10 --domain_name domain.obfuscated.net --disk_pool_name vm-store --vm_type coreos --host_name corea-controller --coreos_channel beta --coreos_create_cluster --cluster_size 3 --deleteifexists --ip_address 10.10.0.125 --nameserver 10.10.0.1 --gateway 10.10.0.1 --netmask 255.255.255.0 --memory 2048 --disk_size_gb 50 

$ ./vmbuilder.py --debug create_vm --bridge_interface br-vlan10 --domain_name domain.obfuscated.net --disk_pool_name vm-store --vm_type coreos --host_name corea-worker --coreos_channel beta --coreos_create_cluster --cluster_size 5 --deleteifexists --ip_address 10.10.0.110 --nameserver 10.10.0.1 --gateway 10.10.0.1 --netmask 255.255.255.0 --memory 2048 --disk_size_gb 50

Once that is done, the VMs are running, but several of their services are erroring out, etcd among them. Why? etcd uses SSL certificates for secure communication among its nodes, and I decided to make certificate generation part of the kubify script below. I might revisit this later, since one should be able to have an etcd cluster up and running without also deploying Kubernetes.
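Those etcd certificates are generated with cfssl (more on that below), whose signing requests are plain JSON documents. A sketch of building one for an etcd peer, assuming the `O` organization name and loopback SAN are conventions rather than anything kubify mandates:

```python
import json

def etcd_csr(hostname, ip):
    """Build a cfssl-style CSR request for an etcd peer certificate.
    The cert must be valid for the node's FQDN, its static IP, and loopback."""
    return json.dumps({
        "CN": hostname,
        "hosts": [hostname, ip, "127.0.0.1"],
        "key": {"algo": "rsa", "size": 2048},
        "names": [{"O": "etcd"}],  # organization name here is illustrative
    }, indent=2)

csr = etcd_csr("corea-controller0.domain.obfuscated.net", "10.10.0.125")
```

The resulting JSON would be fed to `cfssl gencert` along with the CA material; getting the `hosts` list right is most of the battle, since etcd peers verify each other by both name and IP.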

Carrying on…

From github/jforman/kubify:

$ ./kubify.py --output_dir /mnt/localdump1/kubetest1/ --clear_output_dir --config kubify.conf --kube_ver 1.9.3

Using the kubify.conf configuration file, this deploys Kubernetes version 1.9.3 to all the nodes, including Flannel (an inter-node network overlay for pod-to-pod communication), the DNS add-on, and the Dashboard add-on, using RBAC. It uses /mnt/localdump1/kubetest1 as the destination directory on the local machine for certificates, kubeconfigs, systemd unit files, etc.
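Among the per-node artifacts written to that output directory are kubeconfigs. Their structure is the standard v1 `Config` schema, roughly as below; the cluster name, API server address, and file paths are assumptions for illustration, not values taken from kubify:

```python
def make_kubeconfig(cluster, server, user, ca_path, cert_path, key_path):
    """Build a minimal kubeconfig structure (standard v1 Config schema)."""
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{"name": cluster,
                      "cluster": {"server": server,
                                  "certificate-authority": ca_path}}],
        "users": [{"name": user,
                   "user": {"client-certificate": cert_path,
                            "client-key": key_path}}],
        "contexts": [{"name": f"{user}@{cluster}",
                      "context": {"cluster": cluster, "user": user}}],
        "current-context": f"{user}@{cluster}",
    }

# Hypothetical admin kubeconfig pointing at one controller's API server:
cfg = make_kubeconfig("corea", "https://10.10.0.125:6443", "admin",
                      "ca.pem", "admin.pem", "admin-key.pem")
```

Serialized to YAML, this is exactly the file handed to `kubectl --kubeconfig` in the “End Result” listings further down.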

Assumptions made by my script (and config):

  • 10.244.0.0/16 is the pod CIDR. This is the expectation of the Flannel Deployment configuration, and it was easiest to just assume this everywhere as opposed to hacking up the kube-flannel.yml to insert a different one I had been using.
  • Service CIDR is 10.122.0.0/16.
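These two ranges must not overlap, and Flannel carves the pod CIDR into per-node subnets (one /24 per node by default). A quick stdlib check of the assumptions above:

```python
import ipaddress

pod_cidr = ipaddress.ip_network("10.244.0.0/16")
service_cidr = ipaddress.ip_network("10.122.0.0/16")

# The two ranges must be disjoint, or service VIPs would collide with pod IPs.
assert not pod_cidr.overlaps(service_cidr)

# Flannel hands each node one /24 slice: 10.244.0.0/24, 10.244.1.0/24, ...
node_subnets = list(pod_cidr.subnets(new_prefix=24))
```

A /16 pod CIDR therefore supports up to 256 nodes at a /24 each, which is plenty of headroom for an 8-node homelab cluster.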

Things learned:

  • rkt/rktlet as the container runtime is not quite ready for prime time, or perhaps its warts are not well enough documented. See rktlet/issues/183 and rktlet/issues/182.
  • kubelets can crash with an NPE: kubernetes/issues/59969.
  • Using cfssl for generating SSL certificates made life a lot easier than using openssl directly. There are still a ton of certificates.
  • Cross-Node Pod-to-Pod routing is still incredibly confusing, and I’m still trying to wrap my head around CNI, bridging, and other L3-connective technologies.
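On that last point, one way to demystify cross-node routing: whatever the backend (host-gw routes, vxlan tunnels), Flannel ultimately just maps each pod IP to the node that owns its /24 lease. A toy lookup, with hypothetical leases matching two of the pods in the listing below:

```python
import ipaddress

# Hypothetical subnet leases, in the style Flannel records in etcd:
# each node owns one /24 out of the 10.244.0.0/16 pod CIDR.
leases = {
    "10.244.1.0/24": "corea-controller1.obfuscated.domain.net",
    "10.244.7.0/24": "corea-worker4.obfuscated.domain.net",
}

def node_for_pod(pod_ip):
    """Return the node whose Flannel subnet contains pod_ip, else None."""
    addr = ipaddress.ip_address(pod_ip)
    for subnet, node in leases.items():
        if addr in ipaddress.ip_network(subnet):
            return node
    return None
```

Traffic to 10.244.7.3 (the kube-dns pod) gets forwarded to corea-worker4, whose local bridge then delivers it to the pod’s veth. The CNI plumbing on each node is “only” the machinery that makes this table entry true.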

End Result:

$ bin/kubectl --kubeconfig admin/kubeconfig get nodes
NAME                                      STATUS    ROLES     AGE       VERSION
corea-controller0.obfuscated.domain.net   Ready     <none>    6h        v1.9.3
corea-controller1.obfuscated.domain.net   Ready     <none>    6h        v1.9.3
corea-controller2.obfuscated.domain.net   Ready     <none>    6h        v1.9.3
corea-worker0.obfuscated.domain.net       Ready     <none>    6h        v1.9.3
corea-worker1.obfuscated.domain.net       Ready     <none>    6h        v1.9.3
corea-worker2.obfuscated.domain.net       Ready     <none>    6h        v1.9.3
corea-worker3.obfuscated.domain.net       Ready     <none>    6h        v1.9.3
corea-worker4.obfuscated.domain.net       Ready     <none>    6h        v1.9.3


$ bin/kubectl --kubeconfig admin/kubeconfig get pods -o wide --all-namespaces
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE   IP            NODE
kube-system   kube-dns-6c857864fb-tn4r5               3/3       Running   3          6h    10.244.7.3    corea-worker4.obfuscated.domain.net
kube-system   kube-flannel-ds-dlczz                   1/1       Running   2          6h    10.10.0.127   corea-controller2.obfuscated.domain.net
kube-system   kube-flannel-ds-kc45d                   1/1       Running   0          6h    10.10.0.125   corea-controller0.obfuscated.domain.net
kube-system   kube-flannel-ds-kz7ls                   1/1       Running   2          6h    10.10.0.111   corea-worker1.obfuscated.domain.net
kube-system   kube-flannel-ds-lwlf2                   1/1       Running   2          6h    10.10.0.113   corea-worker3.obfuscated.domain.net
kube-system   kube-flannel-ds-mdnv8                   1/1       Running   0          6h    10.10.0.110   corea-worker0.obfuscated.domain.net
kube-system   kube-flannel-ds-q44wt                   1/1       Running   1          6h    10.10.0.112   corea-worker2.obfuscated.domain.net
kube-system   kube-flannel-ds-rdmr5                   1/1       Running   1          6h    10.10.0.114   corea-worker4.obfuscated.domain.net
kube-system   kube-flannel-ds-sr26s                   1/1       Running   0          6h    10.10.0.126   corea-controller1.obfuscated.domain.net
kube-system   kubernetes-dashboard-5bd6f767c7-bnnkm   1/1       Running   0          6h    10.244.1.2    corea-controller1.obfuscated.domain.net
