
SAFL optimization recommendations

This section provides examples of search optimizations in the Search Anywhere Framework Language (hereinafter SAFL) query language. For more information about the list of commands and their purpose, see here.

Full-text search simplifies working with raw events, but it consumes more computing resources. If you know exactly which fields to search, specify them in the query.

Example query for full-text search of login events:

source sm_cs_auth_indexes
| search "*4624*"

The event code is stored in the event.code field, so the following search is recommended instead:

source sm_cs_auth_indexes
| search event.code="4624"

Using wildcards

To search by a value pattern in SAF, you can use wildcards. A wildcard search increases the number of events that have to be examined and requires more computing resources. In the following example, the file.path field contains both the path to the file and its name:

source sm_cs_auth_indexes
| search file.path="*/example.json"

In such cases, it is recommended to parse the data at the source according to the ECS standard. After ECS parsing is applied, two fields are available: file.path (the path to the file) and file.name (the file name). The search is then performed on an exact value of a specific field, in this case file.name with the value example.json:

source sm_cs_auth_indexes
| search file.name="example.json"

Both examples perform the same task, but the second search is more efficient and uses fewer cluster resources.

Use filtering before data manipulation

For best performance, filter data as early as possible, before any other processing. If you need to filter data that has already been processed, you can use the where command.

The following example searches for all events with event.code="4624", aggregates them, and then filters the aggregated results by the user.name field:

source sm_cs_auth_indexes
| search event.code="4624"
| aggs count by user.name, host.name, source.ip
| where user.name=="maksimov.m"
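When the field used in the filter is already present in the raw events, as user.name is here, the condition can usually be moved into the search command so that fewer events reach the aggregation. A minimal sketch of that variant, built only from the commands and operators used in the other examples in this section and assuming the same index and field names:

source sm_cs_auth_indexes
| search event.code="4624" AND user.name="maksimov.m"
| aggs count by user.name, host.name, source.ip

The where command remains the right tool for conditions on values that exist only after processing, such as the aggregated count.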

Search with direct condition

A search with a direct condition uses fewer resources than a search with a negation condition.

Search with negation condition:

source sm_cs_auth_indexes
| search user.name!="SMART-DC$" AND event.action!="logged-in"

Search with direct condition:

source sm_cs_auth_indexes
| search user.name="maksimov.m" AND event.action="logged-out"

Statistics calculation commands

SAF has two commands for calculating statistics: aggs and stats.

The aggs command runs at the SAF Data Storage level, which allows it to process a larger data set. For the stats command, take the qsize parameter into account: it limits the number of events processed. Increasing the qsize parameter increases the load on RAM.

The commands for calculating statistics on a timeline work in a similar way: timeaggs and timechart.

Using commands effectively

To optimize the search and reduce the load on the system, you need to choose the right commands.

The search engine passes the results of each command to the input of the next command. For example, if you need to transform data before aggregation (the aggs command), the eval command will not work and an error will be displayed. Example:

source sm_cs_auth_indexes
| eval user.domain=lower(user.domain)
| aggs count by user.domain

In this case, you must use the peval command, which uses the internal SAF Data Storage mechanism:

source sm_cs_auth_indexes
| peval user.domain=lower(user.domain)
| aggs count by user.domain

How to view fields

To view the fields available in an event after processing, you can use the | table * command.
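For instance, the command can be appended to a filtered search to inspect which fields the matching events carry. This is a minimal sketch; the index and the event code are simply reused from the earlier examples in this section:

source sm_cs_auth_indexes
| search event.code="4624"
| table *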

In SAF, you can enrich the results of a search query using a nested subquery. To do this, use the join command:

source sm_cs_auth_indexes
| search user.name!="*$" AND user.name!="unknown" AND event.code=4624
| peval source.address=coalesce(source.address, source.ip)
| aggs count, latest(@timestamp) as latest_time, earliest(@timestamp) as earliest_time, values(source.address) as source.address_values, values(winlog.computer_name) as winlog.computer_name_values by user.name
| eval duration=strptime(latest_time, "YYYY-MM-dd'T'HH:mm:ss.SSS'Z'") - strptime(earliest_time, "YYYY-MM-dd'T'HH:mm:ss.SSS'Z'")
| eval user.name=lower(user.name)
| join user.name
[ source sm_cs_auth_indexes
| search event.code=4634
| peval user.name=lower(user.name)
| aggs max(@timestamp) as time_password_last_change by user.name ]

SAF allows you to use multiple sources in one search query and combine the resulting data.