An Elasticsearch index is a logical namespace that stores and organizes a collection of related JSON documents, similar to a database table in relational databases but designed for full-text search and analytics.
Each index is uniquely named and can contain any number of documents, where each document is a set of key-value pairs (fields) representing your data.
Key Features of an Elasticsearch Index
- Structure: An index consists of one or more shards, which are distributed across nodes in the Elasticsearch cluster for scalability and resilience.
- Mapping and Search: Indexes define mappings that control how document fields are stored and searched.
- Indexing Process: Data is ingested and stored as JSON documents in the index, and Elasticsearch builds an inverted index to allow for fast searches.
- Use Case: Indices are used to organize datasets in log analysis, search applications, analytics, or any scenario where rapid search/retrieval is needed.
In summary, an Elasticsearch index is the foundational storage and retrieval structure enabling efficient search and analytics on large datasets.
When analysing an arbitrary index, we want to know:
- its size
- its shards: how many there are, and on which nodes they are allocated (plus the allocation criteria: which node types these shards should be allocated to)
- whether it has any data retention defined (Index Lifecycle Policy)
- the historical rate/growth of its storage usage
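Most of these questions can be answered from Kibana DevTools. A starting point (the index name my-index is a placeholder):

```
# Size, shard/replica counts, and document count per index
GET _cat/indices/my-index?v&h=index,pri,rep,docs.count,store.size

# Index settings, including allocation filters and lifecycle policy name
GET my-index/_settings

# Whether (and how) ILM is managing the index
GET my-index/_ilm/explain
```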
Index Lifecycle Management (ILM) Policy
An Index Lifecycle Management (ILM) policy defines what happens to an index as it ages — automatically. It’s a set of rules for retention, rollover, shrink, freeze, and delete.
Example:
PUT _ilm/policy/functionbeat
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "30d", "max_size": "50GB" }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": { "delete": {} }
      }
    }
  }
}
This policy says:
- Keep the index hot (actively written to) until it is 30 days old or 50 GB in size.
- Then roll over (create a new index and switch writes to it).
- After 90 days (measured from rollover), delete the old index.
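To see which lifecycle phase each managed index is currently in, and whether the policy has hit any errors, the ILM explain API can be used (the functionbeat-* pattern follows the example above):

```
GET functionbeat-*/_ilm/explain
```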
ILM can be applied to a standard (non-data-stream) index. We can attach an ILM policy to any index, not just data streams. However, there is a big difference:
- Rollover alias required:
  - Standard index: yes. We must manually set up an alias to make rollover work!
  - Data stream: no (handled automatically: Elasticsearch manages the alias and the backing indices)
- Multiple backing indices:
  - Standard index: optional (via rollover)
  - Data stream: always (that's how data streams work)
- Simplified management:
  - Standard index: manual setup
  - Data stream: built-in
Index Rollover vs Data Stream
If we have a continuous stream of documents (e.g. logs) being written to Elasticsearch, we should not write them to a single regular index, as its size will grow over time and we'll need to keep increasing node storage. Instead, we should consider one of the following options:
- Data stream
- Index with an ILM policy which defines rollover conditions
What does rollover mean for a standard index?
When a rollover is triggered (by size, age, or doc count):
- Elasticsearch creates a new index with the same alias.
- The alias used for writes (e.g. functionbeat-write) is moved from the old index to the new one.
- Functionbeat or Logstash continues writing to the same alias, unaware that rollover happened.
Example:
# Initially
functionbeat-000001 (write alias: functionbeat-write)
# After rollover
functionbeat-000001 (read-only)
functionbeat-000002 (write alias: functionbeat-write)
This keeps the write flow continuous and allows you to:
- Manage old data (delete, freeze, move to cold tier)
- Limit index size for performance
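Rollover can also be invoked manually against the write alias. A sketch (the conditions block is optional; omitting it forces an immediate rollover regardless of size or age):

```
POST functionbeat-write/_rollover
{
  "conditions": {
    "max_age": "30d",
    "max_size": "50gb"
  }
}
```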
How to apply ILM to a standard index?
Here’s a minimal configuration (note that this uses the legacy _template API; the newer composable _index_template API appears in the Index Template section below):
PUT _ilm/policy/functionbeat
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "30d", "max_size": "50GB" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}

PUT _template/functionbeat
{
  "index_patterns": ["functionbeat-*"],
  "settings": {
    "index.lifecycle.name": "functionbeat",
    "index.lifecycle.rollover_alias": "functionbeat-write"
  }
}
The following command creates a new index called functionbeat-000001 (if it doesn’t already exist); if the index does exist, it updates the aliases section. It creates an alias named functionbeat-write that points to this index. Aliases are like virtual index names: you can send reads or writes to the alias instead of a specific index, and they are lightweight and flexible. "is_write_index": true tells Elasticsearch: “When someone writes to this alias, route the write operations to this index.” If you later have functionbeat-000001 and functionbeat-000002 both sharing the alias functionbeat-write, then only the one with "is_write_index": true will receive new documents.
PUT functionbeat-000001
{
  "aliases": {
    "functionbeat-write": { "is_write_index": true }
  }
}
ILM rollover works by:
- Watching the alias (functionbeat-write), not a specific index.
- When rollover conditions are met (e.g. 50 GB or 30 days), Elasticsearch:
  - Creates a new index (functionbeat-000002)
  - Moves "is_write_index": true from 000001 to 000002. From that moment, all new Functionbeat writes go to the new index, automatically.
After rollover:
- functionbeat-000001 becomes read-only, but still searchable.
- ILM will later delete it when it ages out (based on your policy).
So that last command effectively bootstraps the first generation of an ILM-managed index family.
- ILM policy: Automates rollover, delete, etc.
- Rollover action: Creates a new index and shifts the alias
- Alias requirement: Required, used for write continuity
- Data stream alternative: Better option, handles rollover and aliasing for you
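For comparison, a minimal sketch of the data stream alternative: adding a data_stream object to an index template makes matching writes create a data stream, with rollover and aliasing handled internally (the template name and pattern here are illustrative):

```
PUT _index_template/functionbeat-ds
{
  "index_patterns": ["functionbeat-ds*"],
  "data_stream": {},
  "template": {
    "settings": { "index.lifecycle.name": "functionbeat" }
  }
}
```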
Index Template
Index templates do not retroactively apply to existing indices. They only apply automatically to new indices created after the template exists.
When we define an index template like:
PUT _index_template/functionbeat
{
  "index_patterns": ["functionbeat-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "functionbeat"
    }
  }
}
That template becomes part of the index creation logic.
So:
When a new index is created (manually or via rollover),
→ Elasticsearch checks all templates matching the name.
→ The matching template(s) are merged into the new index settings.
Existing indices are not touched or updated.
If we already have an index — e.g. functionbeat-8.7.1 — that matches the template pattern, it won’t automatically get the template settings.
We need to apply those manually, for example:
PUT functionbeat-8.7.1/_settings
{
  "index.lifecycle.name": "functionbeat",
  "index.lifecycle.rollover_alias": "functionbeat-write"
}
Now the existing index is under ILM control (using the same settings the template would have applied if it were created fresh).
Elasticsearch treats index templates as blueprints for new indices, not as live configurations.
This is intentional — applying settings automatically to existing indices could cause:
- unintended allocation moves,
- mapping conflicts,
- or lifecycle phase resets.
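To preview which settings a new index would receive from the matching templates, without actually creating it, the simulate API can be used (the index name here is illustrative):

```
POST _index_template/_simulate_index/functionbeat-000002
```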
We want to keep as little data as possible in Elasticsearch. If the stored data are logs, we want to:
- make sure apps are sending only meaningful logs
- make sure we capture repetitive error messages so the app can be fixed and stops emitting them
Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
If an index template is composed of several component templates and more than one of them imposes an ILM policy, the ILM policy of the last component template wins and is the one applied to indices or data streams created from that index template.
Data retention is applied at the document level while an ILM policy is applied at the index level; when both are defined, the ILM policy takes precedence. Data retention is defined in component templates.
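A minimal sketch of a component template and an index template composed of it (both names are illustrative):

```
PUT _component_template/functionbeat-ilm
{
  "template": {
    "settings": { "index.lifecycle.name": "functionbeat" }
  }
}

PUT _index_template/functionbeat
{
  "index_patterns": ["functionbeat-*"],
  "composed_of": ["functionbeat-ilm"]
}
```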
Shards and Replicas
We can set the number of shards and replicas per index in Elasticsearch when we create the index, and we can dynamically update the number of replicas (but not the number of primary shards) for existing indices.
Setting Shards and Replicas on Index Creation
Specify the desired number in the index settings payload:
PUT /indexName
{
  "settings": {
    "index": {
      "number_of_shards": 6,
      "number_of_replicas": 2
    }
  }
}
This creates the index with 6 primary shards and 2 replicas per primary shard.
Adjusting Replicas After Creation
You can adjust the number of replicas for an existing index using the settings API:
PUT /indexName/_settings
{
  "index": {
    "number_of_replicas": 3
  }
}
Replicas can be changed at any time, but the number of primary shards is fixed for the lifetime of the index.
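Although the primary shard count is fixed, the Shrink (and Split) APIs can produce a resized copy of an index. A sketch of shrinking (the source index must first be made read-only, and shrink additionally requires a copy of every shard on a single node; index names are illustrative):

```
PUT indexName/_settings
{ "index.blocks.write": true }

POST indexName/_shrink/indexName-shrunk
{
  "settings": { "index.number_of_shards": 2 }
}
```

The target shard count must be a factor of the source count (e.g. 6 can shrink to 3, 2, or 1).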
Shard and Replica Principles
- Each index has a configurable number of primary shards.
- Each primary shard can have multiple replica shards (copies).
- Replicas improve fault tolerance and can spread search load.
We should choose shard and replica counts based on data size, node count, and performance needs. Adjusting these settings impacts resource usage and indexing/search performance.
Index Size
To find out the size of each index (shards) we can use the following Kibana DevTools query:
GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason,node,store&s=store:desc
The output contains the following columns:
- index - index name
- shard - ordinal number of a shard. If we have 2 primary shards each with 1 replica, we'd have 4 rows: shard=0 for the first two rows (the first primary and its replica) and shard=1 for the next two rows (the second primary and its replica)
- prirep - is shard a primary (p) or replica (r)
- state - e.g. STARTED
- unassigned.reason - why the shard is unassigned (if applicable)
- node - name of the node
- store - used storage (in gb, mb or kb)
As a general guideline, each shard should not be larger than 50 GB. We can enforce this via an Index Lifecycle Policy where we set rollover criteria.
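A sketch of a policy that caps shard size via rollover. The max_primary_shard_size condition (available in newer Elasticsearch versions) targets primary shard size directly; the policy name is illustrative:

```
PUT _ilm/policy/shard-size-capped
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb" }
        }
      }
    }
  }
}
```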
Document Routing
Routing in an Elasticsearch cluster determines which shard a document is sent to and stored in. The default process uses a hash of the document's _id to find the shard, ensuring an even distribution. However, you can implement custom routing to a specific field value to ensure related documents are on the same shard, which improves search performance by reducing the scope of queries.
How routing works
Default routing: When a document is indexed, Elasticsearch calculates the target shard by hashing the document's ID and using the formula shard = hash(_routing) % number_of_primary_shards. The default _routing value is the _id of the document.
Custom routing: You can specify a different routing value, like a user ID or a country code, by providing it during indexing. This directs all documents with the same routing value to the same shard, which can significantly speed up queries that filter by that value.
Querying with custom routing: When you perform a query, you can provide the same routing value.
Elasticsearch will then only search the specific shard containing documents with that value, rather than searching all shards in the cluster.
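The default formula above can be sketched in Python. Elasticsearch actually uses a murmur3 hash; Python's built-in hash() stands in here purely to illustrate the mechanics, and the shard count is illustrative:

```python
# Sketch of Elasticsearch's default routing formula:
#   shard = hash(_routing) % number_of_primary_shards
# (murmur3 in real Elasticsearch; Python's hash() is a stand-in)

NUMBER_OF_PRIMARY_SHARDS = 6  # illustrative shard count

def target_shard(routing: str, num_shards: int = NUMBER_OF_PRIMARY_SHARDS) -> int:
    """Return the primary shard a document with this routing value lands on."""
    # Python's % on a possibly-negative hash still yields a value in [0, num_shards)
    return hash(routing) % num_shards

# By default, _routing is the document _id:
for doc_id in ["doc-1", "doc-2", "doc-3"]:
    print(doc_id, "-> shard", target_shard(doc_id))

# With custom routing, all documents sharing a routing value
# (e.g. a tenant id) deterministically land on the same shard:
assert target_shard("tenant-42") == target_shard("tenant-42")
```

This also shows why the primary shard count cannot change after index creation: the modulus is baked into where every existing document lives.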
Benefits of custom routing
Improved search speed: By narrowing the search to a specific shard, you reduce the amount of data that needs to be searched, leading to faster results.
Efficient resource use: Routing minimizes the computational overhead on the cluster because nodes don't have to process queries that are irrelevant to their data.
Scalability for multitenant applications: Routing is crucial for horizontal scaling in applications with multiple tenants, as it can isolate each tenant's data to specific shards.
Considerations
Data distribution: If you use a custom routing value, ensure the data is relatively evenly distributed across all shards. If one shard accumulates a disproportionate amount of data, it can create performance bottlenecks.
Security: For multitenant applications, the application layer must handle security and access checks to prevent users from querying data from another user's shard, as Elasticsearch does not enforce this isolation automatically.
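A sketch of custom routing in practice: the same routing value must be supplied at index time and at search time (index name, document, and routing value are illustrative):

```
# Index a document with a custom routing value (e.g. a user id)
PUT my-index/_doc/1?routing=user-1
{ "message": "hello" }

# Search only the shard holding that routing value
GET my-index/_search?routing=user-1
{
  "query": { "match": { "message": "hello" } }
}
```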
Shard Allocation
Just as there are rules which determine which shard a document should be written to, there are rules which determine which node a shard should be placed on (allocated to).
Watermark thresholds
Low watermark: The default is 85%. When a node's disk usage exceeds this limit, Elasticsearch stops allocating new shards to that node.
High watermark: The default is 90%. If a node's disk usage goes above this threshold, Elasticsearch starts relocating existing shards away from that node to other nodes with more available space.
Flood stage watermark: The default is 95%. When this threshold is reached, Elasticsearch makes all indices on that node read-only to prevent further data from being written, though reads are still possible.
Configuration and use cases
Monitoring and prevention: These settings are crucial for preventing nodes from running out of storage, which can cause shard failures and instability.
Proactive scaling: You can set the thresholds based on your infrastructure's growth. For instance, if you anticipate nodes filling up quickly, you might set lower thresholds to proactively distribute the load.
Dynamic systems: Using percentage values is best for dynamic systems where disk sizes can vary. You can also use absolute byte values for fixed-size storage environments.
Cluster settings: These settings are cluster-wide and are managed through the Elasticsearch configuration file (elasticsearch.yml) or by using the cluster update settings API.
Node role-specific settings: For advanced setups, you can configure different flood stage watermarks for nodes with different roles, such as hot, warm, and cold nodes, allowing for more tailored allocation strategies.
How to manage them
Review current settings: you can check the current cluster.routing.allocation.disk.watermark.low, .high, and .flood_stage values via the cluster settings API.
Change settings: To modify the watermarks, update the cluster settings via the API or by editing the elasticsearch.yml configuration file on the master nodes and restarting.
Recommendations: It's recommended to have enough buffer space (e.g., 3x the size of your largest shard) to handle shard relocation and potential growth.
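A sketch of updating the watermarks dynamically via the cluster update settings API (the values shown are the defaults):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}
```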
To check the current allocation watermark settings in the cluster:
GET _cluster/settings?include_defaults=true
Output snippet:
"routing": {
"use_adaptive_replica_selection": "true",
"rebalance": {
"enable": "all"
},
"allocation": {
"enforce_default_tier_preference": "true",
"node_concurrent_incoming_recoveries": "2",
"node_initial_primaries_recoveries": "4",
"desired_balance": {
"max_balance_computation_time_during_index_creation": "1s",
"progress_log_interval": "1m",
"undesired_allocations": {
"log_interval": "1h",
"threshold": "0.1"
}
},
"same_shard": {
"host": "false"
},
"total_shards_per_node": "-1",
"type": "desired_balance",
"disk": {
"threshold_enabled": "true",
"reroute_interval": "60s",
"watermark": {
"flood_stage.frozen.max_headroom": "20GB",
"flood_stage": "95%",
"high": "90%",
"low": "85%",
"flood_stage.frozen": "95%",
"flood_stage.max_headroom": "100GB",
"low.max_headroom": "200GB",
"high.max_headroom": "150GB"
}
},
"awareness": {
"attributes": [
"k8s_node_name"
]
},
"balance": {
"disk_usage": "2.0E-11",
"index": "0.55",
"threshold": "1.0",
"shard": "0.45",
"write_load": "10.0"
},
"enable": "all",
"node_concurrent_outgoing_recoveries": "2",
"allow_rebalance": "always",
"cluster_concurrent_rebalance": "2",
"node_concurrent_recoveries": "2"
}
},
We can see routing.allocation.disk.watermark settings.
If allocation of some shard onto a node of the target type fails, we can check the reason:
GET /_cluster/allocation/explain
...which might produce output revealing that the root cause of the shard being unassigned is insufficient storage on that node:
"node_allocation_decisions": [
{
"node_id": "8r4E9pZL........wwAw",
"node_name": "data-0",
"transport_address": "10.22.31.122:9300",
"node_attributes": {
"k8s_node_name": "ip-10-22-18-240.us-east-1.compute.internal",
"ml.machine_memory": "8589934592",
"ml.max_jvm_size": "3221225472",
"type": "data",
"ml.allocated_processors_double": "1.0",
"ml.config_version": "12.0.0",
"transform.config_version": "10.0.0",
"xpack.installed": "true",
"ml.allocated_processors": "1"
},
"roles": [
"data",
"data_cold",
"data_content",
"data_frozen",
"data_hot",
"data_warm",
"ingest",
"master",
"ml",
"remote_cluster_client",
"transform"
],
"node_decision": "no",
"deciders": [
{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], having less than the minimum required [95.8gb] free space, actual free: [67.5gb], actual used: [89.4%]"
}
]
},
---

