Skip to main content
Version: 5.2

lookup

Description

Allows retrieving data from a predefined lookup table. The command compares specified fields in the event and the lookup table. Upon a full match, the event will be enriched with the specified fields from the lookup table.

Syntax

lookup [dedup=<boolean>] [<execution-type>] [system=<boolean>] <lookup-name> ( <lookup-field> [AS <event-field>] )... [ OUTPUT | OUTPUTNEW (<lookup-destfield> [AS <event-destfield>] )... ]

Required Arguments

ParameterSyntaxDescription
lookup-name<lookup-name>The name of the predefined lookup.

Optional Arguments

ParameterSyntaxDefaultDescription
dedupdedup=<boolean>falseIndicates whether duplicates should be removed from the data compared to the lookup.
<execution-type>type=ELK [packsize=<integer>] | FASTDEFAULTSpecifies the execution type of the command. The ELK type indicates that the command is executed by Elasticsearch/OpenSearch, with a default packsize of 10,000. The FAST type represents an optimized, faster execution version of the command in DEFAULT mode.
systemsystem=<boolean>falseWhen set to true, a system lookup is performed; otherwise, a user lookup is used.
lookup-field<string>Several fields can be specified, separated by spaces.
event-field<string>The name of the field in the main result for comparison with the lookup-field. This is used when lookup-field and event-field have different names.
lookup-destfield<string>The name of the resulting field with obtained data.
OUTPUT | OUTPUTNEWOUTPUT | OUTPUTNEWOUTPUTThe OUTPUT indicator means that existing data in the source query field will be overwritten by the lookup data. The OUTPUTNEW indicator means that data will be overwritten only if the source query field is empty but has corresponding data in the lookup.
Configuration Recommendations
  1. Calling the command without specifying the execution type is recommended for small volumes of data in the lookup (up to 50,000 events). It is also not recommended to run the command this way when sme settings have max_initial_query_len greater than 10,000
  2. Calling the command with the FAST type is not recommended for lookups with more than 500,000 events
  3. For the ELK execution type, it's recommended not to set a high max_initial_query_len. The packsize parameter indicates how many events are processed in each iteration. For example, if max_initial_query_len = 10,000, and packsize = 5,000, then two requests are sent for a query of 10,000 events (max_initial_query_len/packsize). If max_initial_query_len is more than 20,000, it's recommended to set packsize to half or a third of max_initial_query_len

Working with Multivalue

info

Multivalue is the ability for an attribute to have multiple values instead of a single one. A multivalue can be thought of as a list of values. Examples of multivalue include a list of user roles.

When the lookup command is executed, both fields from search results (multivalue on the left) and fields from lookup tables (multivalue on the right) can be of the multivalue type. Thus, there are three cases where the lookup command involves multivalue:

  • multivalue on the left
  • multivalue on the right
  • multivalue on both left and right

Multivalue Value on the left

A value from the lookup table is matched with the search result if the value from the lookup table is contained within the multivalue value of the search result.

Left ValueRight ValueValue IntersectionMatch
[1, 2, 3, 4, 5]33Yes
[1, 2, 3, 4, 5]6-No

Multivalue Value on the Right

A value from the lookup table is matched with the search result if the search result is contained within the multivalue value from the lookup table.

Left ValueRight ValueValue IntersectionMatch
3[1, 2, 3, 4, 5]3Yes
6[1, 2, 3, 4, 5]-No

Multivalue Value on Both Left and Right

A value from the lookup table is matched with the search result if the intersection of the multivalue values on the left and right is not empty.

Left ValueRight ValueValue IntersectionMatch
[1, 2, 3, 4, 5][3][3]Yes
[1, 2, 3, 4, 5][1, 2, 5][1, 2, 5]Yes
[1, 2, 3, 4, 5][1, 2, 6][1, 2]Yes
[1, 2, 3][4, 5, 6]-No
[1, 2, 3][4]-No

Query Examples

Example 1

In the first example, for each incoming event, a match is performed between the database field from the event and the name field in each row of the lookup table. If these values match, the engine field will be added to the event, with its value taken from the corresponding row of the lookup table. If the event already had an engine field, it will be overwritten with the new value.

Sample input data tables-*:

_timedatabasetable_nameengine
2025-06-05 14:30:00hremployees
2025-06-05 14:31:00hrdepartmentsMyISAM
2025-06-05 14:32:00salesordersInnoDB

Lookup table my_click_db:

nametable_nameenginedescriptionversionrowssize.mb
hremployeesInnoDBEmployee data8.0.301001.2
hrdepartmentsInnoDBDepartment data8.0.30100.5
salesordersMyISAMOrders data5.7.4225000045.7
financetransactionsInnoDBFinancial transactions8.0.30500008.1
Using the OUTPUT Option

The OUTPUT option overwrites the engine field in events with values from the lookup table, even if engine is already populated.

Example №1
source tables
| search database = "hr"
| lookup my_click_db name as database OUTPUT engine

The query result may be the following table:

_timedatabasetable_nameenginedescriptionversionrowssize.mb
2025-06-05 14:30:00hremployeesInnoDBEmployee data8.0.301001.2
2025-06-05 14:31:00hrdepartmentsInnoDBDepartment data8.0.30100.5
Using the OUTPUTNEW Option and SOURCE Execution Mode

In this example, a similar match is performed in SOURCE mode. This means the matching will be executed by the storage engine. The packsize=5000 parameter indicates that queries to the storage will be sent in batches of 5000. The OUTPUTNEW option specifies that the engine field will not be overwritten if it already exists in the event.

Example №1
source tables
| search database = "hr"
| lookup type=SOURCE packsize=5000 my_click_db name as database OUTPUTNEW engine

The query result may be the following table:

_timedatabasetable_nameenginedescriptionversionrowssize.mb
2025-06-05 14:30:00hremployeesInnoDBEmployee data8.0.301001.2
2025-06-05 14:31:00hrdepartmentsMyISAMDepartment data8.0.30100.5

Example 2

Example №2
source security_incidents-*
| search vulnerabilities="CVE-*"
| lookup type=ERGO vulnerabilities_info cve_id AS vulnerabilities OUTPUT description, severity

In the second example, a match is performed between the vulnerabilities field from the event and the multivalue field cve_id in each row of the lookup table. The command is executed in ERGO mode, meaning the matching is optimized for RAM efficiency. The OUTPUT option indicates that the description and severity fields from the lookup table will overwrite or be added to the events, enriching them with vulnerability information.

Sample input data security_incidents-*:

_timehostvulnerabilitiesaffected_system
2025-06-05 14:30:00VLG-01CVE-2023-1234
CVE-2023-5678
WebServer
2025-06-05 14:32:00VLG-02CVE-2023-5678
CVE-2024-9012
Database

Lookup table vulnerabilities_info:

cve_iddescriptionseverity
CVE-2023-1234SQL Injection VulnerabilityHigh
CVE-2023-5678Cross-Site Scripting (XSS)Medium
CVE-2024-9012Remote Code ExecutionCritical

The query result may be the following table:

hostaffected_systemdescriptionseverity
VLG-01WebServerSQL Injection VulnerabilityHigh
VLG-01WebServerCross-Site Scripting (XSS)Medium
VLG-02DatabaseCross-Site Scripting (XSS)Medium
VLG-02DatabaseRemote Code ExecutionCritical
Using the dedup Argument

The lookup command enriches events with vulnerability information description and severity, excluding duplicates in the vulnerabilities field before matching thanks to the dedup=true option.

Example №2 with dedup
source security_incidents-*
| search vulnerabilities="CVE-*"
| lookup type=ERGO vulnerabilities_info cve_id AS vulnerabilities OUTPUT description, severity

Sample input data security_incidents-* with duplicate CVE-2023-5678 values in the first record where affected_system="WebServer", and in the second record where affected_system="Database":

_timehostvulnerabilitiesaffected_system
2025-06-05 14:30:00VLG-01CVE-2023-1234
CVE-2023-5678
CVE-2023-5678
WebServer
2025-06-05 14:32:00VLG-02CVE-2023-5678
CVE-2024-9012
CVE-2023-5678
Database

In this case, the query result will exclude duplicate CVEs, as shown at the beginning of Example 2:

hostaffected_systemdescriptionseverity
VLG-01WebServerSQL Injection VulnerabilityHigh
VLG-01WebServerCross-Site Scripting (XSS)Medium
VLG-02DatabaseCross-Site Scripting (XSS)Medium
VLG-02DatabaseRemote Code ExecutionCritical

Example 3

In the third example, a match is performed between the multivalue field malware_description from the event and the multivalue field description in each row of the lookup table. The lookup command enriches events with information about the malware type malware_type based on partial or complete intersection of signature lists.

Sample input data security_alerts-*:

_timehostmalware_descriptionfilemalware_type
2025-06-05 14:30:00VLG-01W32.File.Mal
W32.2C2D
W32.Auto.A
mssecsvc.exe
2025-06-05 14:30:00VLG-02W32.ED01
W32.Overdrive
OLD.exeVirus
2025-06-05 14:30:00VLG-03W32.ED01
W32.Generic
Fax.exeWorm

Lookup table malware_info:

descriptiontype
W32.File.Mal
W32.2C2D
W32.Auto.A
Ransomware
W32.ED01
W32.44D7
Worm
Using the OUTPUT Option

The OUTPUT option overwrites the malware_type field in events with values from the lookup table for all events where there is a non-empty intersection between the event's malware_description and the lookup table's description, regardless of whether the malware_type field was initially populated.

Example №3 with option OUTPUT
source security_alerts-*
| search malware_info="W32*"
| lookup malware_info description AS malware_description OUTPUT type as malware_type

The query result may be the following table:

_timehostmalware_descriptionfilemalware_type
2025-06-05 14:30:00VLG-01W32.File.Mal
W32.2C2D
W32.Auto.A
mssecsvc.exeRansomware
2025-06-05 14:30:00VLG-02W32.ED01
W32.Overdrive
OLD.exeWorm
2025-06-05 14:30:00VLG-03W32.ED01
W32.Generic
Fax.exeWorm
Using the OUTPUTNEW Option

OUTPUTNEW adds type values from the lookup table only for events where the malware_type field is missing or empty. If malware_type is already populated, it is preserved.

Example №3 with option OUTPUTNEW
source security_alerts-*
| search malware_info="W32*"
| lookup malware_info malware AS malware_description AS malware_description OUTPUTNEW type AS malware_type

The query result may be the following table:

_timehostmalware_descriptionfilemalware_type
2025-06-05 14:30:00VLG-01W32.File.Mal
W32.2C2D
W32.Auto.A
mssecsvc.exeRansomware
2025-06-05 14:30:00VLG-02W32.ED01
W32.Overdrive
OLD.exeVirus
2025-06-05 14:30:00VLG-03W32.ED01
W32.Generic
Fax.exeWorm