Big Data

ELASTIC Search – COOKBOOK

February 27, 2021 by Kinshuk Dutta

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java. Following an open-core business model, parts of the software are licensed under various open-source licenses (mostly the Apache License),^[2] while other parts fall under the proprietary (source-available) Elastic License. Official clients are available in Java, .NET (C#), PHP, Python, Apache Groovy, Ruby and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine followed by Apache Solr, also based on Lucene.

Original author	Shay Banon talking about Elasticsearch at Berlin Buzzwords 2010
Initial release	8 February 2010
Written in	Java
License	Various (open-core model), e.g. Apache License 2.0(partially; open source), Elastic License (proprietary; source-available)
Website	www.elastic.co/products/elasticsearch

This blog has a curated list of elastic search packages and resources. It starts with how to install and then show some basic implementation and usage.

For installation I have followed the guide published on Elasticsearch website.

Install Elasticsearch on macOS with Homebrew

Elastic publishes Homebrew formulae so you can install Elasticsearch with the Homebrew package manager.

To install with Homebrew, you first need to tap the Elastic Homebrew repository:

brew tap elastic/tap

Once you’ve tapped the Elastic Homebrew repo, you can use brew install to install the default distribution of Elasticsearch:

brew install elastic/tap/elasticsearch-full

This installs the most recently released default distribution of Elasticsearch. To install the OSS distribution, specify elastic/tap/elasticsearch-oss.

Directory layout for Homebrew installs

When you install Elasticsearch with brew install the config files, logs, and data directories are stored in the following locations.

Type	Description	Default Location	Setting
home	Elasticsearch home directory or `$ES_HOME`	`/usr/local/var/homebrew/linked/elasticsearch-full`
bin	Binary scripts including `elasticsearch` to start a node and `elasticsearch-plugin` to install plugins	`/usr/local/var/homebrew/linked/elasticsearch-full/bin`
conf	Configuration files including `elasticsearch.yml`	`/usr/local/etc/elasticsearch`	`ES_PATH_CONF`
data	The location of the data files of each index / shard allocated on the node. Can hold multiple locations.	`/usr/local/var/lib/elasticsearch`	`path.data`
logs	Log files location.	`/usr/local/var/log/elasticsearch`	`path.logs`
plugins	Plugin files location. Each plugin will be contained in a subdirectory.	`/usr/local/var/homebrew/linked/elasticsearch/plugins`

Configuring Elasticsearch

Elasticsearch ships with good defaults and requires very little configuration. Most settings can be changed on a running cluster using the Cluster update settings API.

The configuration files should contain settings which are node-specific (such as node.name and paths), or settings which a node requires in order to be able to join a cluster, such as cluster.name and network.host.

Config files location

Elasticsearch has three configuration files:

elasticsearch.yml for configuring Elasticsearch
jvm.options for configuring Elasticsearch JVM settings
log4j2.properties for configuring Elasticsearch logging

These files are located in the config directory, whose default location depends on whether or not the installation is from an archive distribution (tar.gz or zip) or a package distribution (Debian or RPM packages).

For the archive distributions, the config directory location defaults to $ES_HOME/config. The location of the config directory can be changed via the ES_PATH_CONF environment variable as follows:

ES_PATH_CONF=/path/to/my/config ./bin/elasticsearch

Alternatively, you can export the ES_PATH_CONF environment variable via the command line or via your shell profile.

For the package distributions, the config directory location defaults to /etc/elasticsearch. The location of the config directory can also be changed via the ES_PATH_CONF environment variable, but note that setting this in your shell is not sufficient. Instead, this variable is sourced from /etc/default/elasticsearch (for the Debian package) and /etc/sysconfig/elasticsearch (for the RPM package). You will need to edit the ES_PATH_CONF=/etc/elasticsearch entry in one of these files accordingly to change the config directory location.

Config file format

The configuration format is YAML. Here is an example of changing the path of the data and logs directories:

path:
    data: /var/lib/elasticsearch
    logs: /var/log/elasticsearch

Settings can also be flattened as follows:

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

In YAML, you can format non-scalar values as sequences:

discovery.seed_hosts:
   - 192.168.1.10:9300
   - 192.168.1.11
   - seeds.mydomain.com

Though less common, you can also format non-scalar values as arrays:

discovery.seed_hosts: ["192.168.1.10:9300", "192.168.1.11", "seeds.mydomain.com"]

Environment variable substitution

Environment variables referenced with the ${...} notation within the configuration file will be replaced with the value of the environment variable. For example:

node.name:    ${HOSTNAME}
network.host: ${ES_NETWORK_HOST}

Values for environment variables must be simple strings. Use a comma-separated string to provide values that Elasticsearch will parse as a list. For example, Elasticsearch will split the following string into a list of values for the ${HOSTNAME} environment variable:

export HOSTNAME=“host1,host2"

Cluster and node setting types

Cluster and node settings can be categorized based on how they are configured:Dynamic

You can configure and update dynamic settings on a running cluster using the cluster update settings API. You can also configure dynamic settings locally on an unstarted or shut down node using elasticsearch.yml.

Updates made using the cluster update settings API can be persistent, which apply across cluster restarts, or transient, which reset after a cluster restart. You can also reset transient or persistent settings by assigning them a null value using the API.

If you configure the same setting using multiple methods, Elasticsearch applies the settings in following order of precedence:

Transient setting
Persistent setting
elasticsearch.yml setting
Default setting value

For example, you can apply a transient setting to override a persistent setting or elasticsearch.yml setting. However, a change to an elasticsearch.yml setting will not override a defined transient or persistent setting.

It’s best to set dynamic, cluster-wide settings with the cluster update settings API and use elasticsearch.yml only for local configurations. Using the cluster update settings API ensures the setting is the same on all nodes. If you accidentally configure different settings in elasticsearch.yml on different nodes, it can be difficult to notice discrepancies.Static

Static settings can only be configured on an unstarted or shut down node using elasticsearch.yml.

Static settings must be set on every relevant node in the cluster.

Setting JVM options

You should rarely need to change Java Virtual Machine (JVM) options. If you do, the most likely change is setting the heap size. The remainder of this document explains in detail how to set JVM options. You can set options either with jvm.options files or with the ES_JAVA_OPTSenvironment variable.

The preferred method of setting or overriding JVM options is via JVM options files. When installing from the tar or zip distributions, the root jvm.options configuration file is config/jvm.options and custom JVM options files can be added to config/jvm.options.d/. When installing from the Debian or RPM packages, the root jvm.options configuration file is /etc/elasticsearch/jvm.options and custom JVM options files can be added to /etc/elasticsearch/jvm.options.d/. When using the Docker distribution of Elasticsearch you can bind mount custom JVM options files into /usr/share/elasticsearch/config/jvm.options.d/. You should never need to modify the root jvm.options file instead preferring to use custom JVM options files. The processing ordering of custom JVM options is lexicographic.

JVM options files must have the suffix .options and contain a line-delimited list of JVM arguments following a special syntax:

lines consisting of whitespace only are ignored
lines beginning with # are treated as comments and are ignored# this is a comment
lines beginning with a - are treated as a JVM option that applies independent of the version of the JVM-Xmx2g
lines beginning with a number followed by a : followed by a - are treated as a JVM option that applies only if the version of the JVM matches the number8:-Xmx2g
lines beginning with a number followed by a - followed by a : are treated as a JVM option that applies only if the version of the JVM is greater than or equal to the number8-:-Xmx2g
lines beginning with a number followed by a - followed by a number followed by a : are treated as a JVM option that applies only if the version of the JVM falls in the range of the two numbers8-9:-Xmx2g
all other lines are rejected

An alternative mechanism for setting Java Virtual Machine options is via the ES_JAVA_OPTSenvironment variable. For instance:

export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/path/to/temp/dir"
./bin/elasticsearch

When using the RPM or Debian packages, ES_JAVA_OPTS can be specified in the system configuration file.

The JVM has a built-in mechanism for observing the JAVA_TOOL_OPTIONS environment variable. We intentionally ignore this environment variable in our packaging scripts. The primary reason for this is that on some OS (e.g., Ubuntu) there are agents installed by default via this environment variable that we do not want interfering with Elasticsearch.

Additionally, some other Java programs support the JAVA_OPTS environment variable. This is nota mechanism built into the JVM but instead a convention in the ecosystem. However, we do not support this environment variable, instead supporting setting JVM options via the jvm.optionsfile or the environment variable ES_JAVA_OPTS as above.

Secure settings

Some settings are sensitive, and relying on filesystem permissions to protect their values is not sufficient. For this use case, Elasticsearch provides a keystore and the elasticsearch-keystoretool to manage the settings in the keystore.

Only some settings are designed to be read from the keystore. However, the keystore has no validation to block unsupported settings. Adding unsupported settings to the keystore causes Elasticsearch to fail to start. To see whether a setting is supported in the keystore, look for a “Secure” qualifier the setting reference.

All the modifications to the keystore take effect only after restarting Elasticsearch.

These settings, just like the regular ones in the elasticsearch.yml config file, need to be specified on each node in the cluster. Currently, all secure settings are node-specific settings that must have the same value on every node.

Reloadable secure settings

Just like the settings values in elasticsearch.yml, changes to the keystore contents are not automatically applied to the running Elasticsearch node. Re-reading settings requires a node restart. However, certain secure settings are marked as reloadable. Such settings can be re-read and applied on a running node.

The values of all secure settings, reloadable or not, must be identical across all cluster nodes. After making the desired secure settings changes, using the bin/elasticsearch-keystore addcommand, call:

POST _nodes/reload_secure_settings
{
  "secure_settings_password": "s3cr3t" 
}

The password that the Elasticsearch keystore is encrypted with.

This API decrypts and re-reads the entire keystore, on every cluster node, but only the reloadable secure settings are applied. Changes to other settings do not go into effect until the next restart. Once the call returns, the reload has been completed, meaning that all internal data structures dependent on these settings have been changed. Everything should look as if the settings had the new value from the start.

When changing multiple reloadable secure settings, modify all of them on each cluster node, then issue a reload_secure_settings call instead of reloading after each modification.

There are reloadable secure settings for:

Auditing security settings

You can use audit logging to record security-related events, such as authentication failures, refused connections, and data-access events.

If configured, auditing settings must be set on every node in the cluster. Static settings, such as xpack.security.audit.enabled, must be configured in elasticsearch.yml on each node. For dynamic auditing settings, use the cluster update settings API to ensure the setting is the same on all nodes.

General Auditing Settings

xpack.security.audit.enabled

(Static) Set to true to enable auditing on the node. The default value is false. This puts the auditing events in a dedicated file named <clustername>_audit.json on each node.

If enabled, this setting must be configured in elasticsearch.yml on all nodes in the cluster.

Audited Event Settings

The events and some other information about what gets logged can be controlled by using the following settings:xpack.security.audit.logfile.events.include(Dynamic) Specifies which events to include in the auditing output. The default value is: access_denied, access_granted, anonymous_access_denied, authentication_failed, connection_denied, tampered_request, run_as_denied, run_as_granted.xpack.security.audit.logfile.events.exclude(Dynamic) Excludes the specified events from the output. By default, no events are excluded.xpack.security.audit.logfile.events.emit_request_body

(Dynamic) Specifies whether to include the request body from REST requests on certain event types such as authentication_failed. The default value is false.

No filtering is performed when auditing, so sensitive data may be audited in plain text when including the request body in audit events.

Local Node Info Settings

xpack.security.audit.logfile.emit_node_name(Dynamic) Specifies whether to include the node name as a field in each audit event. The default value is false.xpack.security.audit.logfile.emit_node_host_address(Dynamic) Specifies whether to include the node’s IP address as a field in each audit event. The default value is false.xpack.security.audit.logfile.emit_node_host_name(Dynamic) Specifies whether to include the node’s host name as a field in each audit event. The default value is false.xpack.security.audit.logfile.emit_node_id(Dynamic) Specifies whether to include the node id as a field in each audit event. This is available for the new format only. That is to say, this information does not exist in the <clustername>_access.log file. Unlike node name, whose value might change if the administrator changes the setting in the config file, the node id will persist across cluster restarts and the administrator cannot change it. The default value is true.

Audit Logfile Event Ignore Policies

These settings affect the ignore policies that enable fine-grained control over which audit events are printed to the log file. All of the settings with the same policy name combine to form a single policy. If an event matches all of the conditions for a specific policy, it is ignored and not printed.xpack.security.audit.logfile.events.ignore_filters.<policy_name>.users(Dynamic) A list of user names or wildcards. The specified policy will not print audit events for users matching these values.xpack.security.audit.logfile.events.ignore_filters.<policy_name>.realms(Dynamic) A list of authentication realm names or wildcards. The specified policy will not print audit events for users in these realms.xpack.security.audit.logfile.events.ignore_filters.<policy_name>.roles(Dynamic) A list of role names or wildcards. The specified policy will not print audit events for users that have these roles. If the user has several roles, some of which are not covered by the policy, the policy will not cover this event.xpack.security.audit.logfile.events.ignore_filters.<policy_name>.indices(Dynamic) A list of index names or wildcards. The specified policy will not print audit events when all the indices in the event match these values. If the event concerns several indices, some of which are not covered by the policy, the policy will not cover this event.

Circuit breaker settings

Elasticsearch contains multiple circuit breakers used to prevent operations from causing an OutOfMemoryError. Each breaker specifies a limit for how much memory it can use. Additionally, there is a parent-level breaker that specifies the total amount of memory that can be used across all breakers.

Except where noted otherwise, these settings can be dynamically updated on a live cluster with the cluster-update-settings API.

Parent circuit breaker

The parent-level breaker can be configured with the following settings:indices.breaker.total.use_real_memory(Static) Determines whether the parent breaker should take real memory usage into account (true) or only consider the amount that is reserved by child circuit breakers (false). Defaults to true.indices.breaker.total.limit (Dynamic) Starting limit for overall parent breaker. Defaults to 70% of JVM heap if indices.breaker.total.use_real_memory is false. If indices.breaker.total.use_real_memory is true, defaults to 95% of the JVM heap.

Field data circuit breaker

The field data circuit breaker estimates the heap memory required to load a field into the field data cache. If loading the field would cause the cache to exceed a predefined memory limit, the circuit breaker stops the operation and returns an error.indices.breaker.fielddata.limit (Dynamic) Limit for fielddata breaker. Defaults to 40% of JVM heap.indices.breaker.fielddata.overhead (Dynamic) A constant that all field data estimations are multiplied with to determine a final estimation. Defaults to 1.03.

Request circuit breaker

The request circuit breaker allows Elasticsearch to prevent per-request data structures (for example, memory used for calculating aggregations during a request) from exceeding a certain amount of memory.indices.breaker.request.limit (Dynamic) Limit for request breaker, defaults to 60% of JVM heap.indices.breaker.request.overhead (Dynamic) A constant that all request estimations are multiplied with to determine a final estimation. Defaults to 1.

In flight requests circuit breaker

The in flight requests circuit breaker allows Elasticsearch to limit the memory usage of all currently active incoming requests on transport or HTTP level from exceeding a certain amount of memory on a node. The memory usage is based on the content length of the request itself. This circuit breaker also considers that memory is not only needed for representing the raw request but also as a structured object which is reflected by default overhead.network.breaker.inflight_requests.limit(Dynamic) Limit for in flight requests breaker, defaults to 100% of JVM heap. This means that it is bound by the limit configured for the parent circuit breaker.network.breaker.inflight_requests.overhead(Dynamic) A constant that all in flight requests estimations are multiplied with to determine a final estimation. Defaults to 2.

Accounting requests circuit breaker

The accounting circuit breaker allows Elasticsearch to limit the memory usage of things held in memory that are not released when a request is completed. This includes things like the Lucene segment memory.indices.breaker.accounting.limit(Dynamic) Limit for accounting breaker, defaults to 100% of JVM heap. This means that it is bound by the limit configured for the parent circuit breaker.indices.breaker.accounting.overhead(Dynamic) A constant that all accounting estimations are multiplied with to determine a final estimation. Defaults to 1

Script compilation circuit breaker

Slightly different than the previous memory-based circuit breaker, the script compilation circuit breaker limits the number of inline script compilations within a period of time.

See the “prefer-parameters” section of the scripting documentation for more information.script.context.$CONTEXT.max_compilations_rate(Dynamic) Limit for the number of unique dynamic scripts within a certain interval that are allowed to be compiled for a given context. Defaults to 75/5m, meaning 75 every 5 minutes.

Cluster-level shard allocation and routing settingsedit

Shard allocation is the process of allocating shards to nodes. This can happen during initial recovery, replica allocation, rebalancing, or when nodes are added or removed.

One of the main roles of the master is to decide which shards to allocate to which nodes, and when to move shards between nodes in order to rebalance the cluster.

There are a number of settings available to control the shard allocation process:

Cluster-level shard allocation settings control allocation and rebalancing operations.
Disk-based shard allocation settings explains how Elasticsearch takes available disk space into account, and the related settings.
Shard allocation awareness and Forced awareness control how shards can be distributed across different racks or availability zones.
Cluster-level shard allocation filtering allows certain nodes or groups of nodes excluded from allocation so that they can be decommissioned.

Besides these, there are a few other miscellaneous cluster-level settings.

Cluster-level shard allocation settingsedit

You can use the following settings to control shard allocation and recovery:cluster.routing.allocation.enable

(Dynamic) Enable or disable allocation for specific kinds of shards:

all – (default) Allows shard allocation for all kinds of shards.
primaries – Allows shard allocation only for primary shards.
new_primaries – Allows shard allocation only for primary shards for new indices.
none – No shard allocations of any kind are allowed for any indices.

This setting does not affect the recovery of local primary shards when restarting a node. A restarted node that has a copy of an unassigned primary shard will recover that primary immediately, assuming that its allocation id matches one of the active allocation ids in the cluster state.cluster.routing.allocation.node_concurrent_incoming_recoveries(Dynamic) How many concurrent incoming shard recoveries are allowed to happen on a node. Incoming recoveries are the recoveries where the target shard (most likely the replica unless a shard is relocating) is allocated on the node. Defaults to 2.cluster.routing.allocation.node_concurrent_outgoing_recoveries(Dynamic) How many concurrent outgoing shard recoveries are allowed to happen on a node. Outgoing recoveries are the recoveries where the source shard (most likely the primary unless a shard is relocating) is allocated on the node. Defaults to 2.cluster.routing.allocation.node_concurrent_recoveries(Dynamic) A shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries.cluster.routing.allocation.node_initial_primaries_recoveries(Dynamic) While the recovery of replicas happens over the network, the recovery of an unassigned primary after node restart uses data from the local disk. These should be fast so more initial primary recoveries can happen in parallel on the same node. Defaults to 4.cluster.routing.allocation.same_shard.host(Dynamic) Allows to perform a check to prevent allocation of multiple instances of the same shard on a single host, based on host name and host address. Defaults to false, meaning that no check is performed by default. This setting only applies if multiple nodes are started on the same machine.

Shard rebalancing settingsedit

A cluster is balanced when it has an equal number of shards on each node without having a concentration of shards from any index on any node. Elasticsearch runs an automatic process called rebalancing which moves shards between the nodes in your cluster to improve its balance. Rebalancing obeys all other shard allocation rules such as allocation filtering and forced awareness which may prevent it from completely balancing the cluster. In that case, rebalancing strives to acheve the most balanced cluster possible within the rules you have configured. If you are using data tiers then Elasticsearch automatically applies allocation filtering rules to place each shard within the appropriate tier. These rules mean that the balancer works independently within each tier.

You can use the following settings to control the rebalancing of shards across the cluster:cluster.routing.rebalance.enable

(Dynamic) Enable or disable rebalancing for specific kinds of shards:

all – (default) Allows shard balancing for all kinds of shards.
primaries – Allows shard balancing only for primary shards.
replicas – Allows shard balancing only for replica shards.
none – No shard balancing of any kind are allowed for any indices.

cluster.routing.allocation.allow_rebalance

(Dynamic) Specify when shard rebalancing is allowed:

always – Always allow rebalancing.
indices_primaries_active – Only when all primaries in the cluster are allocated.
indices_all_active – (default) Only when all shards (primaries and replicas) in the cluster are allocated.

cluster.routing.allocation.cluster_concurrent_rebalance(Dynamic) Allow to control how many concurrent shard rebalances are allowed cluster wide. Defaults to 2. Note that this setting only controls the number of concurrent shard relocations due to imbalances in the cluster. This setting does not limit shard relocations due to allocation filtering or forced awareness.

Shard balancing heuristics settingsedit

Rebalancing works by computing a weight for each node based on its allocation of shards, and then moving shards between nodes to reduce the weight of the heavier nodes and increase the weight of the lighter ones. The cluster is balanced when there is no possible shard movement that can bring the weight of any node closer to the weight of any other node by more than a configurable threshold. The following settings allow you to control the details of these calculations.cluster.routing.allocation.balance.shard(Dynamic) Defines the weight factor for the total number of shards allocated on a node (float). Defaults to 0.45f. Raising this raises the tendency to equalize the number of shards across all nodes in the cluster.cluster.routing.allocation.balance.index(Dynamic) Defines the weight factor for the number of shards per index allocated on a specific node (float). Defaults to 0.55f. Raising this raises the tendency to equalize the number of shards per index across all nodes in the cluster.cluster.routing.allocation.balance.threshold(Dynamic) Minimal optimization value of operations that should be performed (non negative float). Defaults to 1.0f. Raising this will cause the cluster to be less aggressive about optimizing the shard balance.

Regardless of the result of the balancing algorithm, rebalancing might not be allowed due to forced awareness or allocation filtering.

Disk-based shard allocation settingsedit

The disk-based shard allocator ensures that all nodes have enough disk space without performing more shard movements than necessary. It allocates shards based on a pair of thresholds known as the low watermark and the high watermark. Its primary goal is to ensure that no node exceeds the high watermark, or at least that any such overage is only temporary. If a node exceeds the high watermark then Elasticsearch will solve this by moving some of its shards onto other nodes in the cluster.

It is normal for nodes to temporarily exceed the high watermark from time to time.

The allocator also tries to keep nodes clear of the high watermark by forbidding the allocation of more shards to a node that exceeds the low watermark. Importantly, if all of your nodes have exceeded the low watermark then no new shards can be allocated and Elasticsearch will not be able to move any shards between nodes in order to keep the disk usage below the high watermark. You must ensure that your cluster has enough disk space in total and that there are always some nodes below the low watermark.

Shard movements triggered by the disk-based shard allocator must also satisfy all other shard allocation rules such as allocation filtering and forced awareness. If these rules are too strict then they can also prevent the shard movements needed to keep the nodes’ disk usage under control. If you are using data tiers then Elasticsearch automatically configures allocation filtering rules to place shards within the appropriate tier, which means that the disk-based shard allocator works independently within each tier.

If a node is filling up its disk faster than Elasticsearch can move shards elsewhere then there is a risk that the disk will completely fill up. To prevent this, as a last resort, once the disk usage reaches the flood-stage watermark Elasticsearch will block writes to indices with a shard on the affected node. It will also continue to move shards onto the other nodes in the cluster. When disk usage on the affected node drops below the high watermark, Elasticsearch automatically removes the write block.

It is normal for the nodes in your cluster to be using very different amounts of disk space. The balance of the cluster depends only on the number of shards on each node and the indices to which those shards belong. It considers neither the sizes of these shards nor the available disk space on each node, for the following reasons:

Disk usage changes over time. Balancing the disk usage of individual nodes would require a lot more shard movements, perhaps even wastefully undoing earlier movements. Moving a shard consumes resources such as I/O and network bandwidth and may evict data from the filesystem cache. These resources are better spent handling your searches and indexing where possible.
A cluster with equal disk usage on every node typically performs no better than one that has unequal disk usage, as long as no disk is too full.

You can use the following settings to control disk-based allocation:cluster.routing.allocation.disk.threshold_enabled (Dynamic) Defaults to true. Set to false to disable the disk allocation decider.cluster.routing.allocation.disk.watermark.low (Dynamic) Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated.cluster.routing.allocation.disk.watermark.high (Dynamic) Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.cluster.routing.allocation.disk.watermark.enable_for_single_data_node(Static) For a single data node, the default is to disregard disk watermarks when making an allocation decision. This is deprecated behavior and will be changed in 8.0. This setting can be set to true to enable the disk watermarks for a single data node cluster (will become default in 8.0).cluster.routing.allocation.disk.watermark.flood_stage

(Dynamic) Controls the flood stage watermark, which defaults to 95%. Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node, and that has at least one disk exceeding the flood stage. This setting is a last resort to prevent nodes from running out of disk space. The index block is automatically released when the disk utilization falls below the high watermark.

You cannot mix the usage of percentage values and byte values within these settings. Either all values are set to percentage values, or all are set to byte values. This enforcement is so that Elasticsearch can validate that the settings are internally consistent, ensuring that the low disk threshold is less than the high disk threshold, and the high disk threshold is less than the flood stage threshold.

An example of resetting the read-only index block on the my-index-000001 index:

PUT /my-index-000001/_settings
{
  "index.blocks.read_only_allow_delete": null
}

Copy as cURL View in Consolecluster.info.update.interval(Dynamic) How often Elasticsearch should check on disk usage for each node in the cluster. Defaults to 30s.cluster.routing.allocation.disk.include_relocations[~~7.5.0~~] Deprecated in 7.5.0. Future versions will always account for relocations.Defaults to true, which means that Elasticsearch will take into account shards that are currently being relocated to the target node when computing a node’s disk usage. Taking relocating shards’ sizes into account may, however, mean that the disk usage for a node is incorrectly estimated on the high side, since the relocation could be 90% complete and a recently retrieved disk usage would include the total size of the relocating shard as well as the space already used by the running relocation.

Percentage values refer to used disk space, while byte values refer to free disk space. This can be confusing, since it flips the meaning of high and low. For example, it makes sense to set the low watermark to 10gb and the high watermark to 5gb, but not the other way around.

An example of updating the low watermark to at least 100 gigabytes free, a high watermark of at least 50 gigabytes free, and a flood stage watermark of 10 gigabytes free, and updating the information about the cluster every minute:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "100gb",
    "cluster.routing.allocation.disk.watermark.high": "50gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
    "cluster.info.update.interval": "1m"
  }
}

Copy as cURL View in Console

Shard allocation awarenessedit

You can use custom node attributes as awareness attributes to enable Elasticsearch to take your physical hardware configuration into account when allocating shards. If Elasticsearch knows which nodes are on the same physical server, in the same rack, or in the same zone, it can distribute the primary shard and its replica shards to minimise the risk of losing all shard copies in the event of a failure.

When shard allocation awareness is enabled with the dynamic cluster.routing.allocation.awareness.attributes setting, shards are only allocated to nodes that have values set for the specified awareness attributes. If you use multiple awareness attributes, Elasticsearch considers each attribute separately when allocating shards.

By default Elasticsearch uses adaptive replica selection to route search or GET requests. However, with the presence of allocation awareness attributes Elasticsearch will prefer using shards in the same location (with the same awareness attribute values) to process these requests. This behavior can be disabled by specifying export ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.search.ignore_awareness_attributes=true" system property on every node that is part of the cluster.

The number of attribute values determines how many shard copies are allocated in each location. If the number of nodes in each location is unbalanced and there are a lot of replicas, replica shards might be left unassigned.

Enabling shard allocation awarenessedit

To enable shard allocation awareness:

Specify the location of each node with a custom node attribute. For example, if you want Elasticsearch to distribute shards across different racks, you might set an awareness attribute called rack_id in each node’s elasticsearch.yml config file.node.attr.rack_id: rack_oneYou can also set custom attributes when you start a node:`./bin/elasticsearch -Enode.attr.rack_id=rack_one`
Tell Elasticsearch to take one or more awareness attributes into account when allocating shards by setting cluster.routing.allocation.awareness.attributes in every master-eligible node’s elasticsearch.yml config file.cluster.routing.allocation.awareness.attributes: rack_id Specify multiple attributes as a comma-separated list.You can also use the cluster-update-settings API to set or update a cluster’s awareness attributes.

With this example configuration, if you start two nodes with node.attr.rack_id set to rack_oneand create an index with 5 primary shards and 1 replica of each primary, all primaries and replicas are allocated across the two nodes.

If you add two nodes with node.attr.rack_id set to rack_two, Elasticsearch moves shards to the new nodes, ensuring (if possible) that no two copies of the same shard are in the same rack.

If rack_two fails and takes down both its nodes, by default Elasticsearch allocates the lost shard copies to nodes in rack_one. To prevent multiple copies of a particular shard from being allocated in the same location, you can enable forced awareness.

Forced awarenessedit

By default, if one location fails, Elasticsearch assigns all of the missing replica shards to the remaining locations. While you might have sufficient resources across all locations to host your primary and replica shards, a single location might be unable to host ALL of the shards.

To prevent a single location from being overloaded in the event of a failure, you can set cluster.routing.allocation.awareness.force so no replicas are allocated until nodes are available in another location.

For example, if you have an awareness attribute called zone and configure nodes in zone1 and zone2, you can use forced awareness to prevent Elasticsearch from allocating replicas if only one zone is available:

cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2

Specify all possible values for the awareness attribute.

With this example configuration, if you start two nodes with node.attr.zone set to zone1 and create an index with 5 shards and 1 replica, Elasticsearch creates the index and allocates the 5 primary shards but no replicas. Replicas are only allocated once nodes with node.attr.zone set to zone2 are available.

Cluster-level shard allocation filtering

You can use cluster-level shard allocation filters to control where Elasticsearch allocates shards from any index. These cluster wide filters are applied in conjunction with per-index allocation filtering and allocation awareness.

Shard allocation filters can be based on custom node attributes or the built-in _name, _host_ip, _publish_ip, _ip, _host, _id and _tier attributes.

The cluster.routing.allocation settings are dynamic, enabling live indices to be moved from one set of nodes to another. Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.

The most common use case for cluster-level shard allocation filtering is when you want to decommission a node. To move shards off of a node prior to shutting it down, you could create a filter that excludes the node by its IP address:

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}

Cluster routing settings

cluster.routing.allocation.include.{attribute}(Dynamic) Allocate shards to a node whose {attribute} has at least one of the comma-separated values.cluster.routing.allocation.require.{attribute}(Dynamic) Only allocate shards to a node whose {attribute} has all of the comma-separated values.cluster.routing.allocation.exclude.{attribute}(Dynamic) Do not allocate shards to a node whose {attribute} has any of the comma-separated values.

The cluster allocation settings support the following built-in attributes:

`_name`	Match nodes by node name
`_host_ip`	Match nodes by host IP address (IP associated with hostname)
`_publish_ip`	Match nodes by publish IP address
`_ip`	Match either `_host_ip` or `_publish_ip`
`_host`	Match nodes by hostname
`_id`	Match nodes by node id
`_tier`	Match nodes by the node’s data tier role

_tier filtering is based on node roles. Only a subset of roles are data tier roles, and the generic data role will match any tier filtering. a subset of roles that are data tier roles, but the generic data role will match any tier filtering.

You can use wildcards when specifying attribute values, for example:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "192.168.2.*"
  }
}

Miscellaneous cluster settings

Metadata

An entire cluster may be set to read-only with the following setting:cluster.blocks.read_only(Dynamic) Make the whole cluster read only (indices do not accept write operations), metadata is not allowed to be modified (create or delete indices).cluster.blocks.read_only_allow_delete(Dynamic) Identical to cluster.blocks.read_only but allows to delete indices to free up resources.

Don’t rely on this setting to prevent changes to your cluster. Any user with access to the cluster-update-settings API can make the cluster read-write again.

Cluster shard limit

There is a soft limit on the number of shards in a cluster, based on the number of nodes in the cluster. This is intended to prevent operations which may unintentionally destabilize the cluster.

This limit is intended as a safety net, not a sizing recommendation. The exact number of shards your cluster can safely support depends on your hardware configuration and workload, but should remain well below this limit in almost all cases, as the default limit is set quite high.

If an operation, such as creating a new index, restoring a snapshot of an index, or opening a closed index would lead to the number of shards in the cluster going over this limit, the operation will fail with an error indicating the shard limit.

If the cluster is already over the limit, due to changes in node membership or setting changes, all operations that create or open indices will fail until either the limit is increased as described below, or some indices are closed or deleted to bring the number of shards below the limit.

The cluster shard limit defaults to 1,000 shards per data node. Both primary and replica shards of all open indices count toward the limit, including unassigned shards. For example, an open index with 5 primary shards and 2 replicas counts as 15 shards. Closed indices do not contribute to the shard count.

You can dynamically adjust the cluster shard limit with the following setting:cluster.max_shards_per_node

(Dynamic) Limits the total number of primary and replica shards for the cluster. Elasticsearch calculates the limit as follows:

cluster.max_shards_per_node * number of data nodes

Shards for closed indices do not count toward this limit. Defaults to 1000. A cluster with no data nodes is unlimited.

Elasticsearch rejects any request that creates more shards than this limit allows. For example, a cluster with a cluster.max_shards_per_node setting of 100 and three data nodes has a shard limit of 300. If the cluster already contains 296 shards, Elasticsearch rejects any request that adds five or more shards to the cluster.

This setting does not limit shards for individual nodes. To limit the number of shards for each node, use the cluster.routing.allocation.total_shards_per_node setting.

User-defined cluster metadata

User-defined metadata can be stored and retrieved using the Cluster Settings API. This can be used to store arbitrary, infrequently-changing data about the cluster without the need to create an index to store it. This data may be stored using any key prefixed with cluster.metadata.. For example, to store the email address of the administrator of a cluster under the key cluster.metadata.administrator, issue this request:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.metadata.administrator": "[email protected]"
  }
}

User-defined cluster metadata is not intended to store sensitive or confidential information. Any information stored in user-defined cluster metadata will be viewable by anyone with access to the Cluster Get Settings API, and is recorded in the Elasticsearch logs.

Index tombstones

The cluster state maintains index tombstones to explicitly denote indices that have been deleted. The number of tombstones maintained in the cluster state is controlled by the following setting:cluster.indices.tombstones.size(Static) Index tombstones prevent nodes that are not part of the cluster when a delete occurs from joining the cluster and reimporting the index as though the delete was never issued. To keep the cluster state from growing huge we only keep the last cluster.indices.tombstones.size deletes, which defaults to 500. You can increase it if you expect nodes to be absent from the cluster and miss more than 500 deletes. We think that is rare, thus the default. Tombstones don’t take up much space, but we also think that a number like 50,000 is probably too big.

If Elasticsearch encounters index data that is absent from the current cluster state, those indices are considered to be dangling. For example, this can happen if you delete more than cluster.indices.tombstones.size indices while an Elasticsearch node is offline.

You can use the Dangling indices API to manage this situation.

Logger

The settings which control logging can be updated dynamically with the logger. prefix. For instance, to increase the logging level of the indices.recovery module to DEBUG, issue this request:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.indices.recovery": "DEBUG"
  }
}

Copy as cURL View in Console

Persistent tasks allocation

Plugins can create a kind of tasks called persistent tasks. Those tasks are usually long-lived tasks and are stored in the cluster state, allowing the tasks to be revived after a full cluster restart.

Every time a persistent task is created, the master node takes care of assigning the task to a node of the cluster, and the assigned node will then pick up the task and execute it locally. The process of assigning persistent tasks to nodes is controlled by the following settings:cluster.persistent_tasks.allocation.enable

(Dynamic) Enable or disable allocation for persistent tasks:

all – (default) Allows persistent tasks to be assigned to nodes
none – No allocations are allowed for any type of persistent task

This setting does not affect the persistent tasks that are already being executed. Only newly created persistent tasks, or tasks that must be reassigned (after a node left the cluster, for example), are impacted by this setting.cluster.persistent_tasks.allocation.recheck_interval(Dynamic) The master node will automatically check whether persistent tasks need to be assigned when the cluster state changes significantly. However, there may be other factors, such as memory usage, that affect whether persistent tasks can be assigned to nodes but do not cause the cluster state to change. This setting controls how often assignment checks are performed to react to these factors. The default is 30 seconds. The minimum permitted value is 10 seconds.