Container Orchestration
Container/pod operations: Docker vs Kubernetes vs Rancher
- Kubernetes largely uses kubectl and kubeadm.
- Docker also has Swarm and an Enterprise Edition that do orchestration.
- Rancher 1.x uses its own system; 2.x can use Kubernetes.
- Singularity is not included at this time, as it does not have built-in orchestration.
Operation                 Docker/EE            k8s                                Rancher
========================  ===================  =================================  ====================
version                                        kubectl version
env info                  docker info          kubectl cluster-info               rancher environment
Get image from registry   docker pull
List images               docker images -a
Remove image              docker rmi IMG
Build image               docker image build
Push image to registry    docker image push
Start a container/pod     docker run           kubectl run                        rancher stack create
Restart/upgrade a pod     docker start         kubectl apply -f yml               rancher up
List running pods         docker ps            kubectl get pods                   rancher ps
Stop container            docker stop
Remove container          docker rm NAME       kubectl delete deployments --all (?)
Get container's logs      docker logs          kubectl logs                       rancher logs
Shell into container      docker exec -it      kubectl exec -it                   rancher exec -it
(all similar to docker; -it = interactive terminal)
Terms/TLA
CSI: Container Storage Interface. e.g. a Kubernetes driver that allocates storage directly from Proxmox VE.
PV:  Persistent Volume
PVC: Persistent Volume Claim. A PVC is the storage request made by a Pod/user; it gets bound to a PV, the backend volume that satisfies the claim.
PVE: Proxmox VE (Virtual Environment)
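A minimal sketch of how a PVC and PV relate, assuming dynamic provisioning via some storage class (all names below are illustrative, not from this cluster):
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: some-storage-class   # hypothetical storage class name
  resources:
    requests:
      storage: 1Gi
EOF
kubectl get pvc demo-pvc   # STATUS turns Bound once a PV satisfies the claim
kubectl get pv             # the PV backing the claim shows up here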
Kubernetes
minikube start # single node for very simple dev, http://localhost:8080
systemctl start rke2-agent # rke2 is another Kubernetes distribution (instead of minikube), good for production use
kubectl version
kubectl cluster-info
kubectl cluster-info dump
source <(kubectl completion bash) # enable bash autocompletion (maybe add to .bashrc)
kubectl run hello-minikube --image=k8s.gcr.io/echoserver:1.4 --port=8080
kubectl expose deployment hello-minikube --type=NodePort
kubectl delete pod hello-minikube # stop the pod
kubectl get pod
curl $(minikube service hello-minikube --url)
kubectl get pods -A -o wide | grep Running
kubectl get pods -A -o wide | grep -i nvidia-gpu-driver
# see the docker processes/containers running inside the minikube VM (eg VirtualBox)
eval $(minikube docker-env)
docker ps
minikube dashboard # will launch a browser to eg
http://192.168.99.100:30000/#!/overview?namespace=default
minikube ip
minikube service list
minikube stop
kubeadm
# equiv of docker exec into a container and get a shell
kubectl exec -it -n gpu-operator nvidia-gpu-driver-ubuntu24.04-69c78df9cf-2fw48 -- /bin/bash
Random kubernetes commands, sorting/classification TBD
kubectl get cm # ConfigMap
kubectl get sc # storage class
kubectl edit cm -n gpu-operator kernel-module-params # bring up EDITOR to edit config
kubectl get certificates -A
kubectl get certificaterequest -A # see automated requests and whether they were approved…
kubectl get pods -n kube-system
kubectl logs -n gpu-operator nvidia-gpu-driver-ubuntu24.04-6cfdb6bc89-bwgcv # nvidia-gpu-driver is the pod that actually builds the driver; if it fails, errors would show here
rke2 certificate check --output table
On a node:
journalctl -u rke2-agent -f # this will follow the startup procedure of a node.
kubectl get cm -n csi-wekafs -o yaml # output in yaml format, configMap, in this case has info on ca.crt in it
kubectl drain n0001 --ignore-daemonsets --delete-emptydir-data
# rancher drain sometimes still results in pods left running on a node.
Storage Class
helm list -A
helm get values -n csi-wekafs csi-wekafs
kubectl get cm -n csi-wekafs # cert stuff under cm?
kubectl get sc -A
kubectl describe sc storageclass-wekafs-dir-api
kubectl get sc storageclass-wekafs-dir-api
kubectl edit sc storageclass-wekafs-dir-api # if nothing is edited, exits with "no changes detected", saves nothing, and won't trigger anything ## storageclass-wekafs-dir-api.yaml ?
kubectl create -f tinho6-pvc-testonly-v2.yaml # persistent volume claim only; submitted, but not acted on till there is a "user" of it?
kubectl apply -f tinho6-pvc-testonly-v2.yaml # apply is strange to me; it should be the smart way to avoid name conflicts, but haven't gotten it to work.
kubectl get pvc -n tinho
kubectl create -f tinho5_nginx_weka_v6.yaml # create pod using yaml; apply should in theory work better, "smartly" handling conflicts with an already-used name.
kubectl get pod -n tinho
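Guessing at what a pod manifest like tinho5_nginx_weka_v6.yaml might contain, a minimal sketch of a pod that mounts the PVC (pod name, image, and claimName are assumptions):
cat <<'EOF' | kubectl apply -n tinho -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-weka-test              # hypothetical pod name
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    volumeMounts:
    - name: weka-vol
      mountPath: /usr/share/nginx/html
  volumes:
  - name: weka-vol
    persistentVolumeClaim:
      claimName: tinho6-pvc-testonly # hypothetical claim name, must match the PVC created above
EOF
kubectl get pod -n tinho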
Weka helm: deployment changes for CA handling (--allowinsecurehttps)
helm show values csi-wekafs/csi-wekafsplugin > helm_show_values_csi-wekafsplugin.yaml
# ^^ these are "factory" values
helm get values csi-wekafs -n csi-wekafs --all > helm_values_csi-wekafs--all.yaml
# this should fetch current values for the csi,
# omitting --all would exclude what the default values are (9 fewer entries)
kubectl describe -n csi-wekafs deployment/csi-wekafs-controller | grep -i insecure
kubectl edit sc storageclass-wekafs-dir-api
# check and change storageclass of weka (where --allowinsecurehttps was added)
kubectl create -f tinho6-pvc-testonly-v4.yaml
# make request for a simple PersistentVolumeClaim
# if this works, then https api call works
kubectl get storageclass # list all csi
kubectl get storageclass storageclass-wekafs-dir-api
kubectl describe sc storageclass-wekafs-dir-api # details of the storage class
kubectl get secret csi-wekafs-api-secret -n csi-wekafs -o go-template='{{.data.endpoints}}' # value is base64-encoded
helm list -A
helm repo list
helm list -n csi-wekafs
helm upgrade csi-wekafs csi-wekafs/csi-wekafsplugin \
-n csi-wekafs \
--reuse-values \
--set pluginConfig.allowInsecureHttps=true \
--set pluginConfig.encryption.allowEncryptionWithoutKms=false
kubectl describe -n csi-wekafs deployment/csi-wekafs-controller |grep -i insecure
kubectl edit sc storageclass-wekafs-dir-api
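One way to confirm the upgraded controller deployment actually rolled out (a sketch; deployment name taken from the describe command above):
kubectl rollout status -n csi-wekafs deployment/csi-wekafs-controller
kubectl get pods -n csi-wekafs   # controller pods should be Running with a fresh AGE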
ingress stuff, tbd
# bash_history
joe /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx.yaml
helm install rke2-ingress-nginx --values values.yaml --name ingress-metallb
kubectl get ingress -A
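Placeholder for later: a minimal Ingress manifest might look roughly like this (hostname, service name/port, and ingress class are assumptions):
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
  namespace: default
spec:
  ingressClassName: nginx          # assuming the rke2-ingress-nginx class is named "nginx"
  rules:
  - host: demo.example.com         # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-svc         # hypothetical backend service
            port:
              number: 80
EOF
kubectl get ingress -A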
ETCD
kubectl get pods -n kube-system -l component=etcd
export ETCDCTL_ENDPOINTS='https://127.0.0.1:2379'
export ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt'
export ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt'
export ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key'
export ETCDCTL_API=3
find /var/lib/rancher/rke2 -name etcdctl
[root@control0 cert_csi-wekafs_fix]# /var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/164/fs/usr/local/bin/etcdctl member list -w table
[root@control0 cert_csi-wekafs_fix]# /var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/164/fs/usr/local/bin/etcdctl endpoint health -w table
/var/lib/rancher/rke2/bin/etcdctl endpoint status -w table
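With the ETCDCTL_* variables above exported, a sketch of invoking whichever etcdctl binary the find turns up:
ETCDCTL_BIN=$(find /var/lib/rancher/rke2 -name etcdctl | head -1)
"$ETCDCTL_BIN" member list -w table
"$ETCDCTL_BIN" endpoint health -w table
"$ETCDCTL_BIN" endpoint status -w table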
helm
# back up helm values. Ideally the original files would have been kept, but if the previous admin didn't hand them over,
# fetch what can be recovered from the existing kube:
helm get values rke2-ingress-nginx -n kube-system > helm_get_values_rke2-ingress-nginx.yaml
helm get values -n kube-system ingress-public > helm_get_values_ingress-public.yaml
helm get values -n kube-system ingress-metallb > helm_get_values_ingress-metallb.yaml
cp -pi /root/ingress-metallb/values.yaml ./ingress-metallb-values.yaml
cp -pi /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx.yaml .
kubectl describe sc storageclass-wekafs-dir-api > kubectl_desc_sc_storageclass-wekafs-dir-api.txt
helm show values csi-wekafs/csi-wekafsplugin > helm_show_values_csi-wekafsplugin.yaml
# ^^ these are "factory" values
helm get values csi-wekafs -n csi-wekafs --all > helm_values_csi-wekafs--all.yaml
# this should fetch current values for the csi,
# omitting --all would exclude what the default values are (9 fewer entries)
helm install ...
helm upgrade ...
helm uninstall ...
rancher cluster
kubectl get nodes
kubectl describe node n0004
kubectl get nodes -o go-template='{{range .items}}{{$node := . }}{{range .status.conditions}}{{$node.metadata.name}}{{": "}}{{.type}}{{":"}}{{.status}}{{"\n"}}{{end}}{{end}}' | grep -i network
kubeadm cluster
2021.09
Notes from the kubernetes.io instructions on installing a cluster (official setup doc):
Kubectl
-------
This is the command line tool used to interact with the cluster.
Should be easy to install on Linux, etc.
https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
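A minimal Linux install per that page (assuming amd64) looks like:
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client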
Kubelet
-------
This is the daemon process that runs on the worker nodes.
Think slurmd on compute nodes here.
Inbound ports (see the firewalld sketch after this list):
- 10250       kubelet API
- 30000-32767 NodePort Services
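A sketch of opening those ports on a worker, assuming firewalld is in use (adapt for iptables/nftables):
firewall-cmd --permanent --add-port=10250/tcp          # kubelet API
firewall-cmd --permanent --add-port=30000-32767/tcp    # NodePort services
firewall-cmd --reload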
There are version requirements:
- kubelet can NOT be newer than kubeadm.
- kubelet can be 1 minor release behind kubeadm.
Kubeadm
-------
Controls the workers and creates the cluster; the real orchestrator.
RHEL and Debian binaries are available; Google hosts a yum repo for the rpm.
Non-package-manager binaries are available via GitHub.
Sets up etcd, the API server, etc. on the control plane.
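A bare-bones bootstrap with kubeadm looks roughly like this (pod CIDR, control-plane address, token, and hash are placeholders):
kubeadm init --pod-network-cidr=10.244.0.0/16        # on the control-plane node
# on each worker node, using the join command printed by 'kubeadm init':
kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>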
Network
-------
Additional VLAN creation manageable via kubeadm would be nice for growth,
but for a small/static deployment, switch control by kubeadm is not needed.
A private network still needs to be set up and used by Kubernetes.
Container runtimes
Docker used to be the standard, but as of 2020-12 the "dockershim" is being deprecated by Kubernetes.
Kubernetes 1.20 still supports it, with a warning.
Kubernetes 1.22 (released 2021-08) may be the last version supporting dockershim.
Kubernetes 1.23? Will it only support CRI at that point? Will Docker support CRI by then?
Will another company provide a dockershim-ed version of Kubernetes, at support expense?
CRI-O may be the new container substrate Google is pushing for.
podman does not work with Kubernetes, so RHEL 8 may be an issue.
Should be able to install Docker via a non-OS-provided rpm.
containerd: supported. Developed by Docker, uses the OCI image format, and supports CRI.
Docker currently ships with containerd (in addition to docker itself), and it is the only "dual container" env that kubeadm will support without erroring out during install.
Singularity is not mentioned on the kubernetes.io site.
Sylabs docs say its support is via the CRI standard interface.
See "Diff b/w Docker, containerd, CRI-O and runc" at Tutorial Works on the many nuances of the container stack.
Ref
2020:
- What is Kubernetes? concise intro by RH
- Kubernetes clustering choices:
  - Kubeadm (baremetal on CentOS)
  - Fedora multi-node
- Create Custom Kubernetes Cluster from Scratch
- Kubespray: Kubernetes On-prem and cloud (as opposed to Kops or Kubeadm)
- Large (100+) K8s deployment recommendations.
- Kubernetes on DC/OS (Mesosphere)
Rancher
- Rancher is open source: free to run, premium for support service.
A Guide to Kubernetes with Rancher (sales brochure from 2021), page 11, states: Commitment-Free Open Source; no different binary for Free vs Enterprise, just pay for support. (It installs over an existing Kubernetes cluster, so it provides GUI, views, and management, but is not an actual Kubernetes cluster itself?)
- RancherOS is a lightweight OS for hosting containers.
- Installs trivially as a Docker container (see the sketch below).
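The single-node quick start is roughly the following (image tag, ports, and --privileged as commonly documented for Rancher 2.x; verify against current docs):
docker run -d --restart=unless-stopped --privileged -p 80:80 -p 443:443 rancher/rancher:latest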
# create an app stack (list of containers) to be run together
# StackName is a directory containing a docker-compose yaml definition file listing all necessary containers (example below)
rancher stack create StackName
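# a StackName/ directory might contain a compose file along these lines (service name "web" matches the demo below; the image is an assumption):
mkdir -p StackName
cat > StackName/docker-compose.yml <<'EOF'
version: '2'
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
EOF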
# redeploy a running application as per new spec (eg version upgrade)
rancher up --force-upgrade -d --stack StackName
# confirm upgrade (why was this needed?)
rancher up --confirm-upgrade -d --stack StackName
# look at logs of a service. "web" is the name of a service in the demo stack
rancher logs StackName/web
# scale the service "web" to have 2 instances:
rancher scale StackName/web=2
# get shell into container (will ask which one when there is more than 1 instance)
# largely same as docker exec
rancher exec -it StackName/web /bin/bash
Container Landscape
cncf.io landscape chart circa 2018.
(They have an "interactive" view online, but that's mostly a dynamic list of links and doesn't give a big-picture view.)