Skip to main content
Version: 5.2

aggs

Description

Performs statistical operations on data using internal storage mechanisms.

danger

Using aggs in a query is valid only if it follows commands that also use internal storage mechanisms. These include source, inputlookup, search, and peval. This condition must also hold for all subqueries within the query.

Syntax

| aggs [composite=<bool>] <functions-expression> ["," <functions-expression>] [<by_expression>]

Required Arguments

At least one function must be used:

ParameterSyntaxDescription
countcount | count(<field>)Calculates the number of events containing a field. If no field is specified, it calculates the total number of events.
valuesvalues(<field>)Computes an array of unique values for the given field.
avgavg(<field>)Computes the average value for the given field.
dcdc(<field>)Counts the number of unique values in the specified field.
earliestearliest(<field>)Computes the field value for the earliest event. An optional second parameter is the name of the timestamp field (default: @timestamp).
latestlatest(<field>)Computes the field value for the latest event. An optional second parameter is the name of the timestamp field (default: @timestamp).
maxmax(<field>)Calculates the maximum value for the given field.
minmin(<field>)Calculates the minimum value for the given field.
sumsum(<field>)Computes the sum of values for the given field.

Optional Arguments

ParameterSyntaxDefaultDescription
compositecomposite=<boolean>falseAllows the use of scrolling in aggregations to obtain all possible segments (buckets) with multiple queries (similar to scroll in stats). It can only be used if there is grouping (by fields). The number of segments (buckets) that will be returned is fixed—1000.
<by_expression><by_field> ["," <by_field> ...]The name of the field (or fields) for grouping values.
Composite usage

The composite argument is available when querying OpenSearch.

Using Wildcards

The * symbol is supported for bulk field selection based on a pattern in aggregation functions. It can be used to select all fields matching the pattern, as well as to rename them using aliases.

Usage specifics in aggregations:

  • when using numerical aggregations (avg, sum, min, perc, etc.), only numerical fields will be considered. If the pattern matches other field types, the system will warn the user that these fields are excluded from the calculation
  • multiple uses of * within a single pattern are allowed
  • the number of fields for any aggregation is limited by the cluster configuration parameter max_fields_under_pattern, which defaults to 250
... | aggs avg(audit_*) as *
... | aggs count(audit_*_*)
Keyword usage in OpenSearch

Aggregations in OpenSearch perform statistical processing on numeric fields or keyword if the field is text-based. For text fields, you need to append <field-name>.keyword, which should be done for both specified and by fields. Exceptions are specified fields in functions like earliest and latest. Example:

...
| aggs avg(user.keyword), earliest(user) by event.keyword, user_count

Query Examples

warning

The examples below use datasets for mapping the .keyword field. If default indexing is configured, the .keyword suffix must be specified.

Mapping Settings

A text field is broken down into individual words during indexing, allowing for partial matches to be found (e.g., finding "cat" within the word "kitten"). This is convenient for full-text search but requires more resources.

A keyword field is not tokenized—it is stored and searched as a whole. It is suitable for exact matches (e.g., searching for "Dubai" only as a complete value). It performs faster.

Basic Examples

Example 1

Calculating the count of events for each combination of DNS status code and host name.

source winlogbeat_dns
| aggs count by event.code, host.keyword
| sort - count

When using by fields in the query, the result is a row for each unique by field value, containing that value and the result of the statistical functions. Because this example uses two by fields, each unique combination of status and host will be in a separate row. The sort command sorts by the count field by default.

The result of the query could be the following table:

countevent.codehost.keyword
15953DXB-DC-01
14985504BJS-DC-02
3564015SIN-DC-03
2544013DXB-DC-01
1234007SIN-DC-03
984013BJS-DC-02
795504SIN-DC-03

Example 2

Calculating the list of hosts and the count of actions for each user and their action on the hosts.

source wineventlog
| aggs count, values(host.name.keyword) as hosts by user.name.keyword, event.action.keyword

For each unique pair event.action and user.name, the query computes the count of events and a list of unique hosts, stored in the count and hosts fields respectively.

The result of the query could be the following table:

user.name.keywordevent.action.keywordcounthosts
SmithMGcredential-validated41BJS-WS-01
SIN-WS-02
SmithMGlogged-in26SIN-WS-02
DXB-WS-03
KZN-WS-05
SmithMGlogged-out25SIN-WS-02
DXB-WS-03
KZN-WS-05
TaylorSIlogged-out33SIN-WS-02
KZN-WS-05
TaylorSIadded-member-to-group6BJS-WS-01
JohnAEadded-member-to-group14SIN-WS-02
DXB-WS-03

Advanced Examples

Example 1

Getting a list of unique values for the user field:

...
| aggs values(user.keyword)

Returns an array of unique values from the user.keyword field across all documents matching the query.

Sample input data:

usersource.addressevent
SmithMG192.168.1.10login
TaylorSI192.168.1.15logout
SmithMG192.168.1.20login
JohnAE10.0.0.1password_reset
JechVA10.0.0.5password_reset

The query execution result may be the following table:

values (user.keyword)
SmithMG
TaylorSI
JohnAE
JechVA

Example 2

Getting a list of unique values for the user field for each unique event field value.

In this example, the event field in the data has values start and end. After running the command, the system will return pairs of results: start with a list of unique user values (event = "start"), and end with a list of unique user values (event = "end").

...
| aggs values(user.keyword) by event.keyword

The query execution result may be the following table:

event.keywordvalues (user.keyword)
startSmith
Taylor
endJohn

Example 2.2

In this example, the response.status field in the data contains HTTP status values. After executing the command, the system will return pairs: 200 and a list of unique source.address.keyword values (where response.status = 200), 403 and a list of unique source.address.keyword values (where response.status = 403), and similarly for response.status = 401:

... | aggs values(source.address.keyword) by response.status

Sample input data:

user.keywordresponse.statussource.address.keywordrequest.referrer.keywordnetwork.bytes
SmithMG200192.168.1.10/data2048
TaylorSI200192.168.1.15/login1024
SmithMG403175.100.1.20/data512
JohnAE40110.0.0.1/config128

The query execution result may be the following table:

response.statusvalues (source.address.keyword)
200192.168.1.10
192.168.1.15
403175.100.1.20
40110.0.0.1

Example 3

The following examples will use input data from user action audits, a portion of which is presented below. The dataset used in the examples does not have index mapping configured.

l_timehostusertargetactionoutcome
2025-05-30 13:47:08ws-01smithfordlocked-out-user-account
changed-password
added-member-to-group
success
2025-05-30 13:47:08ws-01avdeevfordreset-password
changed-password
failure
2025-05-30 13:47:08ws-01dawsoncooperenabled-user-accountsuccess
2025-05-30 13:47:08ws-02holmesjenkinschanged-passwordfailure
2025-05-30 13:47:08ws-01owenturnerremoved-member-from-groupsuccess
2025-05-30 13:47:08ws-01taylerwatsonadded-member-to-groupfailure
2025-05-30 13:47:08ws-02wrightlloydremoved-member-from-groupfailure

Example 3.1

List of all unique values for the user and action fields:

...
| aggs values(user.keyword), values(action.keyword)

Returns arrays of unique values for user.keyword and action.keyword independently of each other.

The query execution result may be the following table:

values (user.keyword)values (action.keyword)
smith
avdeev
dawson
holmes
wright
locked-out-user-account
reset-password
enabled-user-account
changed-password
removed-member-from-group
added-member-to-group

Example 3.2

List of unique values for user.keyword and action.keyword that share the same combination of user.keyword and action.keyword field values:

...
| aggs values(user.keyword), values(action.keyword) by user.keyword, action.keyword

Groups documents by unique combinations of user.keyword and action.keyword, returning the values of these fields for each group.

The query execution result may be the following table:

user.keywordaction.keywordvalues (user.keyword)values (action.keyword)
smithlocked-out-user-accountsmithlocked-out-user-account
changed-password
added-member-to-group
smithchanged-passwordsmithlocked-out-user-account
changed-password
added-member-to-group
smithadded-member-to-groupsmithlocked-out-user-account
changed-password
added-member-to-group
avdeevreset-passwordavdeevreset-password
changed-password
avdeevchanged-passwordavdeevreset-password
changed-password
dawsonenabled-user-accountdawson["enabled-user-account"]
holmeschanged-passwordholmeschanged-password
dawsonremoved-member-from-groupdawsonremoved-member-from-group
smithadded-member-to-groupsmithadded-member-to-group
wrightremoved-member-from-groupwrightremoved-member-from-group

Example 3.3

Number of documents containing the user.keyword field:

... | aggs count(user.keyword)

The query execution result may be the following table:

count (user.keyword)
5

Example 3.4

Number of documents containing the action.keyword field for each unique value of the user.keyword field:

... | aggs count(action.keyword) by user.keyword

The query execution result may be the following table:

user.keywordcount (user.keyword)
smith955
avdeev34
dawson1007
holmes1022
wright873

Example 3.5

Number of documents containing the user.keyword field and number of documents containing the action.keyword field:

... | aggs count(user.keyword), count(action.keyword)

The query execution result may be the following table:

count (user.keyword)count (action.keyword)
53891

Example 3.6

Number of documents containing the user.keyword field for each unique combination of user.keyword and action.keyword field values:

... | aggs count(user.keyword) by user.keyword, action.keyword

The query execution result may be the following table showing the first 7 entries:

user.keywordaction.keywordcount (user.keyword)
smithlocked-out-user-account201
smithchanged-password40
smithadded-member-to-group46
wrightlocked-out-user-account271
dawsonlocked-out-user-account511
smithdisabled-user-account23
holmesadded-user-account51
avdeevreset-password1
avdeevchanged-password2
dawsonadded-user-account59

Example 3.7

In this example, the system will return the contents of the action.keyword field for the first record in the sample, the contents of the action.keyword field for the last record in the sample, and the average value of the action_count field (which was calculated by the command from the previous example) for each unique user.keyword:

... | aggs count(user.keyword) as action_count by user.keyword, event.keyword
| aggs first(action.keyword), last(action.keyword), avg(action_count.keyword) by user.keyword

The query execution result may be the following table:

user.keywordfirst (action.keyword)last (action.keyword)avg (action_count.keyword)
smithlocked-out-user-accountchanged-password95.8
dawsonlocked-out-user-accountchanged-password100.1
wrightremoved-member-from-groupreset-password102.2
holmesremoved-member-from-groupreset-password113.22
avdeevchanged-passwordreset-password1.5

Example 3.8

In this example, the system will return the contents of the action.keyword field for the earliest record in the sample and the contents of the user field for the latest record in the sample:

... | aggs earliest(action.keyword), latest(user.keyword) by @timestamp

The query execution result may be the following table showing the first 7 entries:

@timestampearliest (action.keyword)latest (user.keyword)
2025-05-30T10:20:00.000000+0000removed-member-from-groupdawson
2025-05-30T10:21:00.000000+0000added-member-to-groupdawson
2025-05-30T10:22:00.000000+0000removed-member-from-groupsmith
2025-05-30T10:23:00.000000+0000deleted-user-accountholmes
2025-05-30T10:24:00.000000+0000enabled-user-accountdawson
2025-05-30T10:25:00.000000+0000locked-out-user-accountdawson
2025-05-30T10:26:00.000000+0000unlocked-user-accountwright

Example 4

Number of unique values in the user field:

... | aggs dc(user)

Sample input data

user
Smith
Holmes
Dawson
Wright

The query execution result may be the following table:

dc(user)
3

Example 5

In this example, the system will return the value of the user field that occurs in 50 percent of cases:

... | aggs perc(user, 50)

In this example, the system groups data by delivery status status and for each group returns an array of unique vehicle identifiers vehicle, calculates the total distance distance, and determines the delivery time delivery_time that occurs in 90 percent of cases:

Пример №11
... | aggs values(vehicle.keyword), sum(distance), perc(delivery_time, 90) by status.keyword

Sample input data:

vehiclestatusdistancedelivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result may be the following table:

status.keywordvalues (vehicle_id.keyword)sum (distance)perc (delivery_time, 90)
deliveredMTL Pounder
Maibatsu Mule
33029.6
in_transitMTL Pounder
Jobiult Hauler
32046.8
pendingVapid Benson5012

Example 6

In this example, the system groups data by combinations of transaction type transaction_type and currency currency, and for each group counts the number of unique clients client, calculates the total transaction amount amount, and returns the identifier of the earliest transaction transaction_id:

... | aggs dc(client.keyword), sum(amount), earliest(transaction_id) by transaction_type.keyword, currency.keyword

Sample input data:

clienttransaction_typecurrencyamounttransaction_id@timestamp
SmithdepositUSD100012025-04-30T10:00:00Z
HolmesdepositUSD50022025-04-30T10:10:00Z
SmithwithdrawalEUR20032025-04-30T10:20:00Z
WrightdepositEUR40042025-04-30T10:30:00Z
HolmeswithdrawalUSD30052025-04-30T10:40:00Z
SmithdepositUSD60062025-04-30T09:50:00Z

The query execution result may be the following table:

transaction_type.keywordcurrency.keyworddc (client.keyword)sum (amount)earliest (transaction_id)
depositUSD215006
depositEUR14004
withdrawalEUR12003
withdrawalUSD13005

Example 7

This example will perform several calculations – for each field matching the pattern audit_* (all fields whose names start with audit_). The output name will be formed from the dynamic part of the field name * with the suffix _count:

... | aggs count(audit_*) as *_count

Sample input data:

audit_nameaudit_statusdistancedelivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result might be the following table:

name_countstatus_count
66

Example 8

This example uses a more complex field pattern containing two * wildcards:

... | aggs max(audit_*_*) as *_count_*

Sample input data:

audit_nameaudit_delivery_statusaudit_distanceaudit_delivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result might be the following table:

delivery_count_time
48

It is important to note that this query uses the numeric aggregation type max – meaning the calculation will be performed on numeric fields. In this case, two fields match the pattern: audit_delivery_status and audit_delivery_time. However, the values in the audit_delivery_status field are not numeric, so the calculation was performed only on the audit_delivery_time field.