aggs
Description
Performs statistical operations on data using internal storage mechanisms.
Using aggs in a query is valid only if it follows commands that also use internal storage mechanisms. These include source, search, and peval. This condition must also hold for all subqueries within the query.
Syntax
| aggs [composite=<bool>] <functions-expression> ["," <functions-expression>] [<by_expression>]
Required Arguments
At least one function must be used:
| Parameter | Syntax | Description |
|---|---|---|
count | count | count(<field>) | Calculates the number of events containing a field. If no field is specified, it calculates the total number of events. |
values | values(<field>) | Computes an array of unique values for the given field. |
avg | avg(<field>) | Computes the average value for the given field. |
dc | dc(<field>) | Counts the number of unique values in the specified field. |
earliest | earliest(<field>) | Computes the field value for the earliest event. An optional second parameter is the name of the timestamp field (default: @timestamp). |
latest | latest(<field>) | Computes the field value for the latest event. An optional second parameter is the name of the timestamp field (default: @timestamp). |
max | max(<field>) | Calculates the maximum value for the given field. |
min | min(<field>) | Calculates the minimum value for the given field. |
sum | sum(<field>) | Computes the sum of values for the given field. |
Optional Arguments
| Parameter | Syntax | Default | Description |
|---|---|---|---|
composite | composite=<boolean> | false | Allows the use of scrolling in aggregations to obtain all possible segments (buckets) with multiple queries (similar to scroll in stats). It can only be used if there is grouping (by fields). The number of segments (buckets) that will be returned is fixed—1000. |
<by_expression> | <by_field> ["," <by_field> ...] | The name of the field (or fields) for grouping values. |
The composite argument is available when querying OpenSearch.
Aggregations in OpenSearch perform statistical processing on numeric fields or keyword if the field is text-based. For text fields, you need to append <field-name>.keyword, which should be done for both specified and by fields. Exceptions are specified fields in functions like earliest and latest. Example:
...
| aggs avg(user.keyword), earliest(user) by event.keyword, user_count
Query Examples
Basic Examples
Example 1
Calculating the count of events for each combination of HTTP status code and host name.
source server_info
| aggs count by status, host.keyword
Since the host field is text-based, you need to refer to it via .keyword.
When using by fields in the query, the result is a row for each unique by field value, containing that value and the result of the statistical functions. Because this example uses two by fields, each unique combination of status and host will be in a separate row.
The result of the query could be the following table:
| count | status | host.keyword |
|---|---|---|
| 1595 | 200 | host1 |
| 1498 | 200 | host2 |
| 3567 | 200 | host3 |
| 254 | 400 | host1 |
| 123 | 400 | host3 |
| 98 | 404 | host2 |
| 279 | 404 | host3 |
Example 2
Calculating the list of hosts and the count of actions for each user and their action on the hosts.
source wineventlog
| aggs count, values(host.name.keyword) as hosts by user.name.keyword, event.action.keyword
For each unique pair event.action and user.name, the query computes the count of events and a list of unique hosts, stored in the count and hosts fields respectively.
The result of the query could be the following table:
| user.name.keyword | event.action.keyword | count | hosts |
|---|---|---|---|
| user1 | credential-validated | 41 | host1 host2 |
| user1 | logged-in | 26 | host2 host3 host5 |
| user1 | logged-out | 25 | host2 host3 host5 |
| user2 | logged-out | 33 | host2 host5 |
| user2 | added-member-to-group | 6 | host1 |
| user3 | added-member-to-group | 14 | host2 host3 |
Advanced Examples
Example 1
Getting a list of unique values for the user field:
...
| aggs values(user.keyword)
Example 2
Getting a list of unique values for the user field for each unique event field value.
In this example, the event field in the data has values start and end. After running the command, the system will return pairs of results: start with a list of unique user values (event = "start"), and end with a list of unique user values (event = "end").
...
| aggs values(user.keyword) by event.keyword
Example 3
A list of all unique values for the user and message fields:
...
| aggs values(user.keyword), values(message.keyword)
Example 4
A list of unique user and message values, where the combination of user and message values is the same:
...
| aggs values(user.keyword), values(message.keyword) by user.keyword, message.keyword
Example 5
The number of documents containing the user field:
...
| aggs count(user.keyword)
Example 6
The number of documents containing the message field for each unique user field value:
...
| aggs count(message.keyword) by user.keyword
Example 7
The number of documents containing the user field and the number of documents containing the message field:
...
| aggs count(user.keyword), count(message.keyword)
Example 8
The number of documents containing the user field for each unique combination of user and message field values:
...
| aggs count(user.keyword) by user.keyword, message.keyword
Example 9
This example retrieves the content of the event field for the earliest and the latest records:
...
| aggs earliest(event), latest(event)
Example 10
The number of unique values in the user field:
...
| aggs dc(user)