Configuring Cluster Snapshotting in SAF
Snapshots are used for backing up indices and restoring data in case of failures.
Information
Terminology used:
- OS_IP - the IP address of one of the OpenSearch cluster servers
- OS_HOME - the home directory of SA Data Storage, typically /app/opensearch/
- BACKUP_DIR - directory for storing snapshots
- REPO_NAME - snapshot repository name
- SNAPSHOT_POLICY_NAME - snapshot policy name
- SNAPSHOT_NAME - snapshot name
- HADOOP_HOME - Apache Hadoop home directory
- HADOOP_DATA - Hadoop data storage directory
- HADOOP_HOST - HDFS node address
- HADOOP_USER - Hadoop user
Preparing Cluster Nodes
Preparation must be performed on all nodes with the data role.
For multi-node clusters, it is recommended to disable shard allocation before preparing the nodes. This can be done via the Developer Console (Main Menu - System Settings - Dev Console) by executing the following command:
PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "none"
}
}
Alternatively, use the terminal with the following command:
curl -XPUT -k -u admin "https://$OS_IP:9200/_cluster/settings?pretty" -H "Content-Type: application/json" -d '{"persistent":{"cluster.routing.allocation.enable": "none"}}'
After all cluster nodes are prepared, re-enable allocation:
PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}
Or use the terminal:
curl -XPUT -k -u admin "https://$OS_IP:9200/_cluster/settings?pretty" -H "Content-Type: application/json" -d '{"persistent":{"cluster.routing.allocation.enable": "all"}}'
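Since the two curl calls above differ only in the allocation mode, they can be wrapped in a small helper when node preparation is scripted. A minimal sketch, assuming OS_IP is exported and the admin password is entered at the curl prompt; the function names are illustrative:

```shell
# Helper for toggling shard allocation around node maintenance.
set -eu

allocation_payload() {
    # Build the _cluster/settings body for the given mode ("none" or "all").
    printf '{"persistent":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

set_allocation() {
    # Apply the mode to the cluster; curl prompts for the admin password.
    curl -k -u admin -XPUT "https://$OS_IP:9200/_cluster/settings?pretty" \
        -H "Content-Type: application/json" \
        -d "$(allocation_payload "$1")"
}

# Usage:
#   set_allocation none   # before preparing the nodes
#   set_allocation all    # after all nodes are prepared
```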
FS Storage Type
Before configuring snapshotting, you must create a directory where SAF will store backups. Open a terminal as the root user and execute the required commands to create the directory and set the appropriate permissions:
- Create a directory for storing snapshots and grant read/write permissions to the opensearch user:
mkdir -p $BACKUP_DIR
chown -R opensearch:opensearch $BACKUP_DIR
- Edit the node configuration file $OS_HOME/opensearch.yml using any text editor, and add the path.repo parameter pointing to the created directory:
path.repo: ["{BACKUP_DIR}"]
It is not recommended to place path.repo on the same disk as the node's data directory.
The specified path must be identical across all nodes participating in snapshot creation. If even a single node lacks access to this path, the snapshot operation may fail.
- Restart the node to apply the changes:
systemctl restart opensearch
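When many data nodes need to be prepared, the directory and path.repo steps above can be collected into one idempotent function. This is only a sketch: the chown and service restart are still run separately as root, exactly as in the steps above, and the function name is illustrative:

```shell
# Idempotent part of the per-node FS preparation: create the snapshot
# directory and add path.repo to opensearch.yml only if it is missing.
set -eu

prepare_repo_path() {
    backup_dir="$1"   # directory that will hold snapshots ($BACKUP_DIR)
    config="$2"       # path to opensearch.yml

    mkdir -p "$backup_dir"

    # Append path.repo only if it is not configured yet,
    # so re-running the script does not duplicate the setting.
    if ! grep -q '^path.repo:' "$config"; then
        printf 'path.repo: ["%s"]\n' "$backup_dir" >> "$config"
    fi
}

# Example (run as root on a real node):
#   prepare_repo_path "$BACKUP_DIR" "$OS_HOME/opensearch.yml"
#   chown -R opensearch:opensearch "$BACKUP_DIR"
#   systemctl restart opensearch
```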
HDFS Storage Type
To use an HDFS cluster as a snapshot repository, install the repository-hdfs plugin in OpenSearch and restart the service:
$OS_HOME/bin/opensearch-plugin install repository-hdfs
systemctl restart opensearch
Deploy Apache Hadoop on the node selected as the storage:
- Download the Apache Hadoop archive using the wget command:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz
- Create the $HADOOP_HOME and $HADOOP_DATA folders, extract the archive, and grant permissions to user $HADOOP_USER:
mkdir -p $HADOOP_HOME
mkdir -p $HADOOP_DATA
tar -xzf hadoop-3.4.2.tar.gz -C $HADOOP_HOME --strip-components 1
chown -R $HADOOP_USER:$HADOOP_USER $HADOOP_HOME
chown -R $HADOOP_USER:$HADOOP_USER $HADOOP_DATA
- To run Apache Hadoop, Java 8 is recommended. Install openjdk-8-jdk:
apt update
apt install openjdk-8-jdk
HDFS Configuration
All subsequent operations must be performed under the $HADOOP_USER account.
- Edit the environment file $HADOOP_HOME/etc/hadoop/hadoop-env.sh with any convenient editor: uncomment the line # export JAVA_HOME= and add the path to Java 8:
export JAVA_HOME={JAVA_HOME}
- Edit the file
$HADOOP_HOME/etc/hadoop/core-site.xml, adding the following setting:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://<HADOOP_HOST>:<PORT></value>
</property>
</configuration>
The port must be either 9000 or 8020. Hadoop operation is not guaranteed with other ports.
- Next, open the file $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add paths for data storage:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>{HADOOP_DATA}/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>{HADOOP_DATA}/datanode</value>
</property>
</configuration>
- For Hadoop to work, it must be able to connect over SSH to localhost without a password. Check the connection with the following command:
ssh localhost
If a password is requested when connecting, use the following commands:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
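The key-generation commands above can be made idempotent, so re-running the setup does not append duplicate keys. A sketch, assuming the OpenSSH client tools are installed; the function name and the ssh_dir parameter are illustrative:

```shell
# Idempotent passwordless-SSH setup for the Hadoop user.
set -eu

setup_ssh_key() {
    ssh_dir="$1"   # normally ~/.ssh
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        mkdir -p "$ssh_dir"
        chmod 0700 "$ssh_dir"
        # Generate a key with an empty passphrase, quietly.
        ssh-keygen -t rsa -P '' -f "$ssh_dir/id_rsa" -q
    fi
    # Authorize the key for loopback logins, avoiding duplicate entries.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 0600 "$ssh_dir/authorized_keys"
}

# Usage: setup_ssh_key "$HOME/.ssh" && ssh localhost true
```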
- Format the storage before starting work:
$HADOOP_HOME/bin/hdfs namenode -format
- Configure variables for $HADOOP_USER in the file /etc/profile.d/hadoop.sh:
export HADOOP_HOME={HADOOP_HOME}
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HDFS_NAMENODE_USER=$HADOOP_USER
export HDFS_DATANODE_USER=$HADOOP_USER
export HDFS_SECONDARYNAMENODE_USER=$HADOOP_USER
- Start DFS with the start-dfs.sh script:
$HADOOP_HOME/sbin/start-dfs.sh
Check the startup with the jps command. If NameNode, DataNode, and SecondaryNameNode are present in the output, the startup was successful. The NameNode web interface is available at http://<HADOOP_HOST>:9870; the file system can be browsed under Utilities -> Browse the file system.
- Create a folder for storing snapshots in HDFS (the path /{REPO_NAME}/{CLUSTER_NAME} is chosen as an example) and grant access rights to opensearch:
hdfs dfs -mkdir -p /{REPO_NAME}/{CLUSTER_NAME}
hdfs dfs -chown -R opensearch:supergroup /{REPO_NAME}
Preparing the Snapshot Repository
A snapshot repository must be created with the necessary parameters. Supported storage types include FS, S3, HDFS, and others.
FS Repository
PUT /_snapshot/{REPO_NAME}
{
"type": "fs",
"settings": {
"location": "{BACKUP_DIR}"
}
}
List of FS Snapshot Repository Parameters:
| Parameter | Description |
|---|---|
| location | Directory where snapshots will be stored. |
| chunk_size | Splits large files into smaller chunks during snapshot creation (e.g., 64MB, 1GB). Default: 1gb. (Optional) |
| compress | Boolean. Whether to compress metadata files. Default: false. (Optional) |
| max_restore_bytes_per_sec | Maximum snapshot restore speed. Default: 40 MB/s. (Optional) |
| max_snapshot_bytes_per_sec | Maximum snapshot creation speed. Default: 40 MB/s. (Optional) |
| remote_store_index_shallow_copy | Boolean. Determines whether to store index snapshots as shallow copies. Default: false. (Optional) |
| shallow_snapshot_v2 | Boolean. Enables second-generation shallow snapshotting. Default: false. (Optional) |
| readonly | Boolean. Whether the repository is read-only. Default: false. (Optional) |
HDFS Repository
PUT _snapshot/{REPO_NAME}
{
"type": "hdfs",
"settings": {
"uri": "hdfs://<HADOOP_HOST>:<PORT>/",
"path": "/{REPO_NAME}/{CLUSTER_NAME}"
}
}
Table of HDFS snapshot repository parameters:
| Parameter | Purpose |
|---|---|
| uri | URI of the HDFS cluster where snapshots will be stored, e.g. hdfs://<HADOOP_HOST>:<PORT>/. Required parameter |
| path | Path in the HDFS file system where snapshots will be stored. Required parameter |
| load_defaults | Whether to load default Hadoop configurations from the classpath. Default: true. Optional parameter |
| conf.<key> | Allows passing arbitrary Hadoop-specific settings. The full list is in the core and hdfs parameter lists. Optional parameter |
| compress | Whether to compress metadata files. Default: true. Optional parameter |
| readonly | Makes the repository read-only. Optional parameter |
Using HDFS as snapshot storage allows any snapshot to serve as a recovery point for any cluster. To do this, add a repository with a name different from $REPO_NAME and set the readonly parameter to prevent accidental writes:
PUT _snapshot/{SEARCHABLE_REPO_NAME}
{
"type": "hdfs",
"settings": {
"uri": "hdfs://<HADOOP_HOST>:<PORT>/",
"path": "/{REPO_NAME}/{SEARCHABLE_CLUSTER_NAME}",
"readonly": true
}
}
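After registering a repository (FS or HDFS), it is worth confirming that every node can actually read and write to it before relying on it for backups. A sketch using the standard repository verification API in the Dev Console:

```
POST _snapshot/{REPO_NAME}/_verify
```

The response lists the nodes on which verification succeeded; a node missing from the list usually points to a path.repo or permissions problem on that node.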
Configuring Automatic Snapshots
To automate cluster snapshotting, create a snapshot policy using the Developer Console (Main Menu - System Settings - Dev Console) by executing a command with the required parameters:
POST _plugins/_sm/policies/{SNAPSHOT_POLICY_NAME}
{
"name": "snapshot-daily-{{date}}",
"description": "Daily snapshot policy",
"creation": {
"schedule": {
"cron": {
"expression": "0 8 * * *",
"timezone": "UTC"
}
},
"time_limit": "1h"
},
"deletion": {
"schedule": {
"cron": {
"expression": "0 8 * * *",
"timezone": "UTC"
}
},
"condition": {
"max_age": "7d",
"max_count": 50,
"min_count": 30
},
"time_limit": "1h"
},
"snapshot_config": {
"date_format": "yyyy-MM-dd-HH:mm",
"timezone": "UTC",
"indices": [".*"],
"repository": "{REPO_NAME}",
"ignore_unavailable": "true",
"include_global_state": "false",
"partial": "true",
"metadata": {
"any_key": "any_value"
}
}
}
Snapshot policy parameter table:
| Parameter | Type | Description |
|---|---|---|
| description | String | Description of the snapshot policy. (Optional) |
| enabled | Boolean | Whether the policy should be enabled upon creation. (Optional) |
| snapshot_config | Object | Configuration settings for the snapshot. (Required) |
| snapshot_config.date_format | String | Snapshot names follow the format {SNAPSHOT_POLICY_NAME}-<date>-<random number>. date_format defines the date format in the snapshot name. Default: yyyy-MM-dd'T'HH:mm:ss. (Optional) |
| snapshot_config.date_format_timezone | String | Time zone used for the <date> part of snapshot names. Default: UTC. (Optional) |
| snapshot_config.indices | String | Pattern for the indices to include in snapshots. Default: * (all indices). (Optional) |
| snapshot_config.repository | String | Name of the repository where snapshots will be stored. (Required) |
| snapshot_config.ignore_unavailable | Boolean | Whether to ignore unavailable indices. Default: false. (Optional) |
| snapshot_config.include_global_state | Boolean | Whether to include cluster state in the snapshot. Default: true. (Optional) |
| snapshot_config.partial | Boolean | Allows creation of partial snapshots. Default: false. (Optional) |
| snapshot_config.metadata | Object | Key-value metadata associated with the snapshot. (Optional) |
| creation | Object | Settings for snapshot creation. (Required) |
| creation.schedule | String | Cron expression defining the snapshot schedule. (Required) |
| creation.time_limit | String | Maximum duration to wait for snapshot creation to complete. If time_limit exceeds the interval between scheduled snapshots, the next snapshot will not start until the previous one completes. (Optional) |
| deletion | Object | Settings for snapshot deletion. By default, all snapshots are retained. (Optional) |
| deletion.schedule | String | Cron expression for snapshot deletion. Defaults to the creation.schedule setting. (Optional) |
| deletion.time_limit | String | Maximum time allowed for completing snapshot deletion. (Optional) |
| deletion.delete_condition | Object | Conditions that trigger snapshot deletion. (Optional) |
| deletion.delete_condition.max_count | Integer | Maximum number of snapshots to retain. (Optional) |
| deletion.delete_condition.max_age | String | Maximum age for retained snapshots. (Optional) |
| deletion.delete_condition.min_count | Integer | Minimum number of snapshots to retain. Default: 1. (Optional) |
| notification | Object | Notification settings for snapshot policy events (requires a configured OpenSearch notification channel). (Optional) |
| notification.channel | Object | Defines the notification channel. (Required) |
| notification.channel.id | String | ID of the notification channel. (Required) |
| notification.conditions | Object | Events that trigger notifications; set a condition to true to enable it. (Optional) |
| notification.conditions.creation | Boolean | Whether to notify when a snapshot is created. Default: true. (Optional) |
| notification.conditions.deletion | Boolean | Whether to notify when a snapshot is deleted. Default: false. (Optional) |
| notification.conditions.failure | Boolean | Whether to notify on snapshot creation or deletion failure. Default: false. (Optional) |
| notification.conditions.time_limit_exceeded | Boolean | Whether to notify when snapshot operations exceed time_limit. Default: false. (Optional) |
Snapshots are incremental — previously saved segments are not duplicated.
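Once the policy is created, its configuration and current state can be inspected from the Dev Console. A sketch using the snapshot management API (the _explain form reports the policy's current creation and deletion state):

```
GET _plugins/_sm/policies/{SNAPSHOT_POLICY_NAME}
GET _plugins/_sm/policies/{SNAPSHOT_POLICY_NAME}/_explain
```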
Manual Snapshot Creation
To create a one-time snapshot, execute the following command in the Dev Console:
PUT _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}
{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": false
}
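By default this request returns as soon as the snapshot is initiated, and the snapshot continues in the background. For scripted one-off backups, it can be convenient to block until the snapshot finishes by adding the standard wait_for_completion query parameter:

```
PUT _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}?wait_for_completion=true
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false
}
```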
Restoring from a Snapshot
If you plan to restore data on a target cluster from another cluster's snapshot, ensure that the OpenSearch versions on the two clusters match.
- View available snapshots using the following command in the Dev Console:
GET _snapshot/{REPO_NAME}/_all
The output will list all created snapshots in the following format, where the snapshot field indicates the snapshot name (SNAPSHOT_NAME):
{
"snapshots": [
{
"snapshot": "daily-snapshot-sm-policy-2025-04-21-14:30-bu2lfnek",
"uuid": "iqHGMwR5T6yV-tInX3a5KQ",
"version_id": 136397827,
"version": "2.18.0",
"remote_store_index_shallow_copy": false,
"indices": [
"index1",
"index2"
],
"data_streams": [],
"include_global_state": true,
"metadata": {
"sm_policy": "daily-snapshot-sm-policy"
},
"state": "SUCCESS",
"start_time": "2025-04-21T14:30:02.327Z",
"start_time_in_millis": 1745245802327,
"end_time": "2025-04-21T14:31:09.966Z",
"end_time_in_millis": 1745245869966,
"duration_in_millis": 67639,
"failures": [],
"shards": {
"total": 2,
"failed": 0,
"successful": 2
}
}
]
}
- Restore the desired snapshot using the following command in the Dev Console:
POST _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}/_restore
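The bare _restore call restores every index contained in the snapshot, and it will fail for any index that is currently open under the same name. The restore API also accepts a request body for selective restores; a sketch using the standard indices and rename options (the restored_ prefix here is illustrative):

```
POST _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}/_restore
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_$1"
}
```

Restoring under a renamed index avoids conflicts with live indices and lets you inspect the data before switching over.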
Snapshot Deletion
- View available snapshots using the following command in the Dev Console:
GET _snapshot/{REPO_NAME}/_all
This will return the list of snapshots in the same format as described in the Restoring from a Snapshot section.
- Delete the desired snapshot using the following command:
DELETE _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}