Version: 5.0

Operating System Update

Abbreviations and variables used in the article

  1. OS - operating system, e.g., RHEL
  2. DB - database
  3. USER_ADMIN - username with sufficient privileges, often admin
  4. SERVER_HOST - IP address or domain name of one of the OpenSearch cluster servers (not necessarily the one being worked on)
  5. PATH_SSL - directory where certificates are located, usually /app/opensearch/config

Updating the OS on the server

In most cases, SAF components do not depend on system packages, so updating the OS should not affect the product's functionality. The general approach to updating the OS consists of the following steps, which are best performed sequentially, one server at a time, on each server where SAF components are installed:

  1. Ensure that the updated OS is on the supported list for SAF
  2. Review the list of changes during the update
  3. When working with OpenSearch nodes, disable allocation before starting work
  4. Stop SAF components
  5. Update the OS and reboot the server
  6. Start SAF components, verify functionality, and enable allocation

It is recommended to test the update on a test server. Alternatively, you can update the OS on the self-monitoring server (selfmon) - this server usually includes almost all SAF components used on the main servers.

The recommended order for updating nodes is as follows:

  1. OpenSearch cluster nodes with the Data role and cold data distribution type
  2. OpenSearch cluster nodes with the Data role and warm/hot data distribution type
  3. OpenSearch cluster nodes with the Master role and other roles
  4. OpenSearch Dashboards servers
  5. Logstash servers
  6. Other SAF servers

More details on the update steps can be found below.

List of Changes During Update

The main components of the SAF system, in most cases, do not use system packages. Below is a list of things to pay attention to:

  1. The orchestration (management, Ansible) server, in particular how Ansible itself is installed. If Ansible was installed from system packages, some Ansible playbooks may stop working after its version changes
  2. If Ansible Semaphore is used on the Ansible server, check the Docker container (if used) and the database it relies on, for example, the PostgreSQL version
  3. Data collection servers may run scripts that rely on system utilities, such as Python. If the update moves from Python 2.7 to Python 3.6, these scripts will most likely stop working
  4. Changes in security policies and security utilities. For example, carefully review changes in SELinux, ACLs, and the firewall
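
Before updating, it can help to record the versions of the system tools mentioned above so that they can be compared after the reboot. The following is a minimal sketch and not part of the SAF tooling: the helper name report_tool_version and the list of tools are assumptions and should be adjusted to what is actually installed on the server.

```shell
# Hypothetical helper: print "<tool>: <first line of its --version output>",
# or "<tool>: not installed" if the command is not found in PATH.
report_tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
    else
        printf '%s: not installed\n' "$1"
    fi
}

# Assumed tool list, based on the items above; adjust as needed.
for tool in python python3 ansible docker psql; do
    report_tool_version "$tool"
done
```

Redirecting the output to a file before and after the update and comparing the two files with diff shows which versions changed.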

Actions on the OpenSearch Cluster

To disable allocation on the cluster, run the following command in the developer console (Main Menu - Settings - Dev Console) of the SAF web interface:

PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "none"
}
}

Alternatively, you can execute the request from the command line:

curl -XPUT -k -u "$USER_ADMIN" "https://$SERVER_HOST:9200/_cluster/settings?pretty" \
-H "Content-Type: application/json" \
-d '{"persistent":{"cluster.routing.allocation.enable": "none"}}'

You can omit the -u option and the username, but then you must specify the super admin certificate and key. The command then looks like this:

curl -XPUT -k "https://$SERVER_HOST:9200/_cluster/settings?pretty" \
-H "Content-Type: application/json" \
-d '{"persistent":{"cluster.routing.allocation.enable": "none"}}' \
--cacert "$PATH_SSL/ca-cert.pem" \
--cert "$PATH_SSL/admin-cert.pem" \
--key "$PATH_SSL/admin-key.pem"
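
To confirm that the setting was applied, the current value can be read back from the cluster. The sketch below assumes basic authentication with the -u option and uses the flat_settings query parameter so that the key appears in dotted form. The function names are illustrative, and the sed expression is a simple text match rather than a full JSON parser.

```shell
# Extract the value of cluster.routing.allocation.enable from a
# flat-settings JSON response read on stdin.
# Assumes the key occurs once (for example, only under "persistent").
allocation_mode() {
    sed -n 's/.*"cluster\.routing\.allocation\.enable"[[:space:]]*:[[:space:]]*"\([a-z_]*\)".*/\1/p'
}

# Query the cluster and print the current allocation mode.
# USER_ADMIN and SERVER_HOST are the variables described at the top of the article.
get_allocation_mode() {
    curl -s -k -u "$USER_ADMIN" \
        "https://$SERVER_HOST:9200/_cluster/settings?flat_settings=true" |
        allocation_mode
}
```

Calling get_allocation_mode after allocation has been disabled should print none. If the setting has never been set explicitly, nothing is printed.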

To enable allocation again (do this after the node has rejoined the cluster), run the same request with none replaced by all. For example, from the developer console (Main Menu - Settings - Dev Console) of the web interface:

PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}

Alternatively, you can execute the request from the command line:

curl -XPUT -k -u "$USER_ADMIN" "https://$SERVER_HOST:9200/_cluster/settings?pretty" \
-H "Content-Type: application/json" \
-d '{"persistent":{"cluster.routing.allocation.enable": "all"}}'

To check the cluster health, you can use the following command in the developer console:

GET _cluster/health

Alternatively, from the command line:

curl -k -u "$USER_ADMIN" "https://$SERVER_HOST:9200/_cluster/health?pretty"
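
After allocation is re-enabled, the cluster usually needs some time to reassign shards. A minimal polling sketch is shown here. The function names, the 10-second interval, and the default of 60 attempts are assumptions, and the status is extracted with a plain text match rather than a JSON parser.

```shell
# Print the "status" field (green, yellow or red) from a
# _cluster/health JSON response read on stdin.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Poll the cluster until it reports the wanted status.
# $1 = wanted status (default: green), $2 = maximum attempts (default: 60).
# Returns 0 when the status is reached, 1 when attempts run out.
wait_for_status() {
    wanted="${1:-green}"
    attempts="${2:-60}"
    while [ "$attempts" -gt 0 ]; do
        current="$(curl -s -k -u "$USER_ADMIN" \
            "https://$SERVER_HOST:9200/_cluster/health" | health_status)"
        echo "cluster status: ${current:-unknown}"
        [ "$current" = "$wanted" ] && return 0
        attempts=$((attempts - 1))
        sleep 10
    done
    return 1
}
```

With -u and only a username, curl asks for the password on every attempt, so the certificate-based form of the request may be more convenient inside the loop.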

To check the list of nodes and their roles, you can use the following command:

GET _cat/nodes

From the command line:

curl -k -u "$USER_ADMIN" "https://$SERVER_HOST:9200/_cat/nodes"

To view OpenSearch node parameters, in particular the data distribution type (routing_mode), use the following command:

GET _cat/nodeattrs

From the command line:

curl -k -u "$USER_ADMIN" "https://$SERVER_HOST:9200/_cat/nodeattrs"
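
To follow the recommended update order, it helps to list the nodes that carry a given attribute value. The sketch below filters _cat/nodeattrs output restricted to the node, attr, and value columns. The function name is illustrative, the attribute name routing_mode is taken from the text above, and the value cold is an assumed example that may differ in a particular installation.

```shell
# Print the names of nodes whose attribute $1 has value $2.
# Expects "_cat/nodeattrs?h=node,attr,value" output on stdin,
# i.e. three whitespace-separated columns per line
# (node names containing spaces are not handled).
nodes_with_attr() {
    awk -v attr="$1" -v val="$2" '$2 == attr && $3 == val { print $1 }'
}

# Example (not run here): list nodes with routing_mode = cold.
# curl -s -k -u "$USER_ADMIN" \
#     "https://$SERVER_HOST:9200/_cat/nodeattrs?h=node,attr,value" |
#     nodes_with_attr routing_mode cold
```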

Disabling Services Before Updating the OS

The main SAF services are managed using systemctl. Stop the ones that are installed on the server being updated:

systemctl stop opensearch
systemctl stop opensearch-dashboards
systemctl stop logstash
systemctl stop safBeatManager
systemctl stop safBeat
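
Not every server hosts all of these services. A hedged sketch that stops only the units present on the current server follows. The function names are illustrative, and the check relies on systemctl cat returning a non-zero exit code for units that do not exist.

```shell
# Service names as listed above.
SAF_SERVICES="opensearch opensearch-dashboards logstash safBeatManager safBeat"

# Succeeds if a unit file for service $1 exists on this server.
service_installed() {
    systemctl cat "$1" >/dev/null 2>&1
}

# Stop every listed service that is installed; report the ones skipped.
stop_installed_services() {
    for svc in $SAF_SERVICES; do
        if service_installed "$svc"; then
            echo "stopping $svc"
            systemctl stop "$svc"
        else
            echo "skipping $svc (not installed)"
        fi
    done
}
```

Run stop_installed_services as root (or through sudo) before starting the OS update.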