# Commands for searching in Clickhouse
To search the Clickhouse repository, use the commands source
, search
, aggs
, timeaggs
и peval
.
source
For more information about source command see Clickhouse Search Basics
search
Command for searching data. For more information, see search
.
Examples
Example 1
In this example, search for events for which the host.hostname
field contains the value host1*
.
source clk:sysmon
| search host.hostname = "host1*"
| table host.hostname
Search query result:
host.hostname |
---|
host11 |
host14 |
host17 |
host156 |
Example 2
Search for events where filtering is performed by the fields host.name
, host.ip
, name`.
source clk:sysmon
| search host.name = "host1*" and cidr host.ip = "178.0.0.0/8" and name = "CPU utilization"
| talbe host.name, host.ip, name
Search query result:
host.name | host.ip | name |
---|---|---|
host12 | 178.16.0.33 | CPU utilization |
host133 | 178.26.0.89 | CPU utilization |
aggs
This command performs statistical operations on data using internal storage algorithms. For more information see aggs
.
- Don't use
.keyword
to access text fields - No
composite
parameter - Features available
last
,first
,range
,stdev
иlist
List of available functions for the aggs command in Clickhouse:
Parameter | Syntax | Description |
---|---|---|
count | count | count(<field>) | Calculates the number of events that contain the field. If the field is not specified, calculates the total number of events. |
values | values(<field>) | Calculates an array of unique values for the given field. |
avg | avg(<field>) | Calculates the average value for the specified field. |
dc | dc(<field>) | Calculates the number of unique values in the given field. |
earliest | earliest(<field>) | Calculates the field value for the earliest event. The second (optional) parameter is the field name with a timestamp. Defaults to @timestamp . |
first | first(<field>) | Calculates the first value given the given field. |
last | last(<field>) | Calculates the last value for the given field. |
latest | latest(<field>) | Calculates the field value for the most recent event. The second (optional) parameter is the field name with a timestamp. Defaults to @timestamp . |
list | list(<field>) | Calculates an array of all values for the given field. |
max | max(<field>) | Calculates the maximum value for a given field. |
min | min(<field>) | Calculates the minimum value for a given field. |
range | range(<field>) | Calculates the difference between the maximum and minimum values for a given field. |
stdev | stdev(<field>) | Calculates the standard deviation for a given field. |
sum | sum(<field>) | Calculates the sum of values for a given field. |
Examples
Example 1
The example will calculate the number of events for each unique value in the event.code
field:
source clk:sysmon
| aggs count by event.code
The result of the search query:
event.code | count |
---|---|
1 | 346 |
7 | 232 |
12 | 783 |
22 | 456 |
Example 2
The example will find the value of the last process_id
and the number of unique process_id
values for each unique pair of event.code
and host.hostname
:
source clk:sysmon
| aggs latest(process_id), dc(process_id) by event.code, host.hostname
The result of the search query:
event.code | host.hostname | latest(process_id) | dc(process_id) |
---|---|---|---|
1 | host1 | 991580 | 128 |
7 | host2 | 996541 | 226 |
7 | host3 | 871234 | 98 |
12 | host1 | 965323 | 765 |
timeaggs
Searches and generates an array of data distributed along a timeline using internal storage algorithms. For more information, see timeaggs
.
- Don't use
.keyword
to access text fields - No
composite
parameter` - There are no milliseconds in the
_time
field - The right border of the time period is not included if there are no events in it
Examples
Example 1
The example will plot the number of events for each host.hostname value:
source clk:sysmon
| timeaggs count by host.hostname
Example 2
In the example, graphs will be plotted for the average and maximum load values for the CPU and free RAM:
source clk:zabbix
| timeaggs avg(value) as 'Average value', max(value) as 'Maximum value' by name
peval
Performs various operations on data. Uses internal storage algorithms. For more information, see peval
.
- An error will occur when accessing a non-existent field
- The values
true
andfalse
must be called astrue()
andfalse()
respectively - To write
null
to a field, it must have the Nullable property
Differences between functions and eval
The mvdedup
and exact
functions are not available in Clickhouse.
Function name | Features in Clickhouse. |
---|---|
mvzip | Works only with arrays of the same length. |
printf | Formatting is not available. |
coalesce | Only works for fields of the same type. |
match | You don't have to use a regular expression. |
tonumber | Not available for arrays. |
tostring | Can also be used for multivalue. |
mvappend | Cannot be used for different data types. |
strftime | The time format is different (for example %Y-%m-%d %R:%S ). |
strptime | No second argument needed. |
md5 | Can only be used for String . |
abs | Only numbers can be used. |
floor | Only numbers can be used. |
ceil | Only numbers can be used. |
log | You can only calculate logarithms in base 2 or 10 . |
tonumber | Converts to float64 . |
typeof | The type names are different: instead of Boolean - Bool , Integer - UInt8 , etc. |
len | Does not count the length of a number; to do this, you need to wrap the number in tostring() . |
ltrim | Trims only regular spaces, no second argument needed. |
replace | Doesn't work with regular expressions. |
rtrim | Similar to ltrim . |
substr | For a negative index, counts from 0 or includes the first boundary. |
trim | No second argument needed. |
Examples
Example 1
If the name
field is equal to "CPU utilization"
, then the type
field will be set to "CPU"
. Otherwise, the type
field will be set to "MEM"
:
source clk:zabbix
| peval type = if (name == "CPU utilization", "CPU", "MEM")
| table name, type
The result of the search query:
name | type |
---|---|
CPU utilization | CPU |
CPU utilization | CPU |
Available memory in % | MEM |
Example 2
In the example, the type
field will take a value similar to the first example, the value
field will be rounded to an integer. After this, events will be filtered, in which the final selection will include events whose type
field value is equal to CPU
, and value
is greater than 10
:
source clk:zabbix
| peval type = if (name == "CPU utilization", "CPU", "MEM")
| peval value = round(value)
| search type = "CPU" and value > 10
| table name, type, value
The result of the search query:
name | type | value |
---|---|---|
CPU utilization | CPU | 12 |
CPU utilization | CPU | 17 |
CPU utilization | CPU | 36 |
CPU utilization | CPU | 13 |