Create a Fault-Tolerant and Scalable Elasticsearch Cluster with Bitnami and Helm

Vikram Vaswani


Elasticsearch is a powerful open source search engine that makes it easy to execute both structured and unstructured searches on data. It is also highly scalable, enabling you to perform real-time searches across hundreds of thousands of documents without sacrificing speed or quality.

Bitnami now offers an Elasticsearch Helm chart that makes it easy to deploy Elasticsearch in Kubernetes-based production environments. The Bitnami Elasticsearch Helm chart configures a fault-tolerant cluster with separate master, ingest, coordinating and data nodes. This chart also follows current best practices for security and scalability.

Deployment Options

The Bitnami Elasticsearch Helm chart can be deployed on any Kubernetes cluster. With the chart, Bitnami provides two configuration files: values.yaml, which initializes the deployment using a set of default values and is intended for development or test environments, and values-production.yaml, which is intended for production environments.

The values.yaml file deploys a cluster containing 6 nodes, as follows:

  • 2 master-eligible nodes
  • 2 coordinating-only nodes
  • 2 data nodes

The values-production.yaml file deploys a cluster containing 11 nodes, as follows:

  • 2 ingest nodes
  • 3 master-eligible nodes
  • 2 coordinating-only nodes
  • 3 data nodes
  • 1 metrics-exporter node
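
These node counts come from replica settings in the production configuration file. As a rough sketch of how they appear (the key names below are assumptions and may differ between chart versions, so confirm them against your copy of values-production.yaml):

master:
  replicas: 3        # master-eligible nodes
coordinating:
  replicas: 2        # coordinating-only nodes
data:
  replicas: 3        # data nodes
ingest:
  enabled: true
  replicas: 2        # ingest nodes
metrics:
  enabled: true      # adds the metrics-exporter node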

This blog post will use Azure Kubernetes Service (AKS) and the values-production.yaml configuration file, but it's equally easy to deploy the Bitnami Elasticsearch chart on Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS) or even minikube for quick testing.

Deploying the Cluster

To deploy the Bitnami Elasticsearch chart on AKS, provision a new Kubernetes cluster on Microsoft Azure and then install and configure kubectl and Helm with the necessary credentials. You will find a detailed walkthrough of these steps in our AKS guide. We recommend deploying the Elasticsearch Helm chart on a 3-node DS3_v2 cluster with at least 30GB of available disk space. However, depending on the likely workload of your cluster, you may want to use a different machine type and/or additional disks.
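
If the Bitnami chart repository is not already registered with your Helm client, add it first (the URL below is the standard location of the Bitnami charts repository):

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm repo update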

Once the cluster is provisioned, download the values-production.yaml file and deploy the chart using the command below:

$ helm install --name my-release -f values-production.yaml bitnami/elasticsearch

This command creates a deployment with the name my-release. You can use a different release name if you wish - just remember to update it in the previous and following commands. Monitor the pods until the deployment is complete:

$ kubectl get pods -w

Here is a sample of the command output showing the running pods:

[Image: pod status output showing the running pods]

To check that everything is working correctly, use the Elasticsearch REST API. Obtain the name of a coordinating-only pod, create a tunnel to it and then retrieve a list of available nodes from the API:

$ export POD_NAME=$(kubectl get pods --namespace default -l "app=elasticsearch,release=my-release,role=coordinating-only" -o jsonpath="{.items[0].metadata.name}")
$ kubectl port-forward $POD_NAME 9200:9200 &
$ curl http://127.0.0.1:9200/_cat/nodes

If you see output similar to the image below, your cluster is good to go!

[Image: list of Elasticsearch cluster nodes]
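
As an additional check, you can query the cluster health endpoint through the same tunnel; a "green" status (or "yellow" while replicas are still being allocated) means the cluster is operational:

$ curl "http://127.0.0.1:9200/_cluster/health?pretty"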

Default Network Topology and Security

Depending on whether you use the development or production values, the Bitnami Elasticsearch deployment is configured with either 6 or 11 nodes. However, you can scale the cluster up or down by adding or removing nodes even after the initial deployment.

The Bitnami Elasticsearch chart does not require RBAC rules to be deployed. The cluster listens on the standard port 9200. Remote connections are enabled for this port by default, although you should remember to expose the coordinating-only nodes through a NodePort service to enable external access.
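
For example, one way to enable external access after deployment is to switch the coordinating-only service to the NodePort type. The service name below is an assumption based on the release naming convention used in this post; verify it with kubectl get svc first:

$ kubectl get svc
$ kubectl patch svc my-release-elasticsearch-coordinating-only -p '{"spec": {"type": "NodePort"}}'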

Data Replication and Persistence

A key feature of Bitnami's Elasticsearch Helm chart is that it comes pre-configured to provide a horizontally scalable and fault-tolerant deployment. When a master-eligible node fails, a new master is chosen from among the remaining available master-eligible nodes.

Data persistence is enabled by default in the chart configuration. Data is persisted only on the data nodes and not on the master-eligible, ingest and coordinating-only nodes. Data nodes are deployed as a StatefulSet and a separate persistent volume is created for each pod of the StatefulSet. If the master node fails, connected applications may experience some downtime until a new master is elected. However, there would not be any data loss, as the data is stored in separate persistent volumes.
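
You can list the persistent volume claims created for the data StatefulSet with kubectl; their names typically include the release name (an assumption based on the release used in this post):

$ kubectl get pvc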

By default, the values-production.yaml configuration file initializes an 8 GB persistent volume for each data node. However, you can modify the disk size by setting different values at deployment time, as in the example below, which configures a 16 GB persistent volume on the data nodes instead:

$ helm install --name my-release -f values-production.yaml --set data.persistence.size=16Gi bitnami/elasticsearch

Horizontal Scaling

You can easily scale the cluster up or down by adding or removing nodes. The default configuration in values.yaml creates 2 deployments (for master and coordinating-only nodes) and 1 StatefulSet (for data nodes). When using values-production.yaml, the chart creates 3 deployments (for master, ingest and coordinating-only nodes) and 1 StatefulSet (for data nodes).

Depending on your requirements, simply scale the corresponding deployment up or down. For example, to scale the number of coordinating-only nodes up to 5, use the command below:

$ kubectl scale deployment my-release-elasticsearch-coordinating-only --replicas=5

Elasticsearch nodes can automatically detect other nodes and coordinate between them. So, wait for the new nodes to become active and then check for their presence in the cluster using the REST API shown previously.
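
Data nodes are managed by a StatefulSet rather than a Deployment, so they are scaled with the corresponding command. The StatefulSet name below is an assumption based on the release naming convention; confirm it with kubectl get statefulsets:

$ kubectl scale statefulset my-release-elasticsearch-data --replicas=4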

Metrics

The Bitnami Elasticsearch Helm chart also includes a Prometheus exporter, so if you already have Prometheus running in your Kubernetes cluster, it is easy to monitor the status of your deployment. To use it, deploy the chart with metrics enabled using the command below:

$ helm install --name my-release -f values-production.yaml --set metrics.enabled=true bitnami/elasticsearch

Then, configure your Prometheus server to retrieve the metrics from the metrics node (port 9108). Here are example commands to obtain the pod name, forward the necessary port and retrieve the metrics:

$ export POD_NAME=$(kubectl get pods --namespace default -l "app=elasticsearch,release=my-release,role=metrics" -o jsonpath="{.items[0].metadata.name}")
$ kubectl port-forward $POD_NAME 9108:9108 &
$ curl http://127.0.0.1:9108/metrics

Here's an example of the output:

[Image: Prometheus metrics output]
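
If your Prometheus server runs inside the same Kubernetes cluster, a scrape job along the following lines could collect these metrics. The target service name is an assumption based on the release naming convention; verify it with kubectl get svc:

scrape_configs:
  - job_name: 'elasticsearch'
    static_configs:
      - targets: ['my-release-elasticsearch-metrics.default.svc.cluster.local:9108']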

See the full list of Prometheus-related parameters in the chart template.

Updates

You can update to the latest version with these commands:

$ helm repo update
$ helm upgrade my-release -f values-production.yaml bitnami/elasticsearch
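
After the upgrade, you can watch the pods roll over with the same command used earlier:

$ kubectl get pods -w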

If this sounds interesting to you, why not try it now? Deploy the Bitnami Elasticsearch Helm chart on Azure Kubernetes Service (AKS) and then tweet @bitnami and tell us what you think!