As I said above, Kubernetes is a separate ecosystem that you have to learn and extend with additional tools. If you choose Self-Hosted, you will have to do all of this yourself.
All tools that complement Kubernetes are Open Source solutions that need to be configured. You will need to install a monitoring system in the cluster, implement load balancing, collect and store logs, configure security and user authorization, networks, etc.
For example, you will need to monitor both the cluster itself and the applications running in it. Standard monitoring through Zabbix is not enough here; you will need a more specialized tool such as Prometheus or Telegraf.
With logs, the situation is similar: out of the box, you get only the log history of applications that are currently running, and it disappears on redeploy. Collecting logs from Kubernetes by hand will not work; you need to connect a log collector like Fluentd and a storage system like Elasticsearch or Loki. Load balancing has to be handled separately: you will need a fault-tolerant balancer like MetalLB.
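To make the load-balancing point concrete: on a self-hosted cluster, a Service of type LoadBalancer gets no external IP until something like MetalLB is installed to assign one. Below is a minimal sketch of such a Service, built as a Python dict and printed as JSON, which kubectl accepts alongside YAML. The service name, the `app` label, and the ports are made up for illustration, and the MetalLB configuration itself is not shown.

```python
import json


def make_loadbalancer_service(name, app_label, port, target_port):
    """Build a Service manifest of type LoadBalancer as a plain dict.

    All argument values are illustrative; nothing here is specific to MetalLB.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Traffic goes to pods that carry this label.
            "selector": {"app": app_label},
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical application "web" listening on port 8080 inside the pod.
    service = make_loadbalancer_service("web", "web", 80, 8080)
    print(json.dumps(service, indent=2))
```

The output can be saved to a file and passed to `kubectl apply -f`. Without a balancer implementation in the cluster, the external IP of this Service simply stays pending.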
Storage Systems For Self-Hosted Kubernetes Are Another Headache
Kubernetes was initially designed for Stateless applications – they do not store anything inside containers. When working with Stateful applications that store data, the question arises of connecting external storage.
The easiest option, and a common one, is to bring up a single NFS server, but it is a cheap stopgap: it provides neither high availability nor data safety.
Serious problems can arise if production services with important data are put on slow and unreliable NFS.
For the application to work normally without changes to its logic, you need Persistent Volumes, storage that is attached to pods. They are mounted inside containers as local directories, so the application can store data “under itself”. Working options include CephFS, GlusterFS, and FC (Fibre Channel); a complete list of storage systems can be found in the official documentation.
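As a rough illustration of how a pod gets such a volume, here is a minimal sketch that builds a PersistentVolumeClaim and a Pod that mounts it, again as Python dicts rendered to JSON. The names `app-data` and `app`, the image `example/app:1.0`, the mount path, and the storage class `ceph-rbd` are all assumptions for the example; the storage class in particular must already exist in the cluster, which is exactly the integration work this section is about.

```python
import json


def make_pvc(name, size, storage_class):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def make_pod(name, image, claim_name, mount_path):
    """Build a Pod manifest that mounts the claim as a local directory."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "data",
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


def render(*manifests):
    """Wrap manifests in a v1 List so they can be applied in one call."""
    bundle = {"apiVersion": "v1", "kind": "List", "items": list(manifests)}
    return json.dumps(bundle, indent=2)


if __name__ == "__main__":
    # Hypothetical names and storage class, for illustration only.
    pvc = make_pvc("app-data", "10Gi", "ceph-rbd")
    pod = make_pod("app", "example/app:1.0", "app-data", "/var/lib/app")
    print(render(pvc, pod))
```

From the application's point of view, `/var/lib/app` is an ordinary directory; where the data physically lives is decided by the storage class behind the claim.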
Integrating Kubernetes with Persistent Volumes is not a trivial task. To deploy Ceph, for example, it is not enough to take a manual from Habr and run a few commands. On top of that, someone has to look after the storage systems afterwards, which again means a separate engineer, or even several.
If the Self-Hosted cluster is deployed not on hardware but on virtual machines in the cloud, everything is a little simpler: you don’t need to raise your own Ceph cluster. You can take a storage cluster from the provider and teach it to work with your K8s cluster, provided the provider is ready to give you an API to their storage system, which is not the case everywhere. Even then, you will have to write the integration yourself.
True, you can rent object storage or a cloud DBMS from IaaS providers, but only if the application logic allows it. In Managed Kubernetes solutions, by contrast, Persistent Volumes are already integrated out of the box.