Migration to ECS

ECS (Elastic Common Schema) is an open specification developed with the support of the Elastic user community. ECS defines a common set of fields to use when storing event data in OpenSearch.

Using ECS allows you to:

  • normalize your event data for subsequent analysis and visualization
  • unify data to reduce future labor costs by maximizing compatibility and reuse (for example, using dashboards developed elsewhere without additional adaptation)
  • speed up OpenSearch in general, because each field is assigned the correct data type

For more information, see the official documentation. Everything described below applies to ECS version 8.6.0.

Please note!

In Logstash 8.x, ECS compatibility is enabled by default and corresponds to v8; you can disable it if necessary. In Logstash 7.x, ECS compatibility is disabled by default, and v1 mode is available. After upgrading to the current version, setting v1 mode is not recommended: field names may conflict with those built into v8, in which case the Logstash pipeline will not start.

ECS field levels

In ECS, fields are divided into two levels:

  • core (basic) fields - fields that suit any data source, for example, @timestamp, tags, message, and others. Fill in these fields first.
  • extended (additional) fields - the remaining fields. They apply to narrower use cases or are more open to interpretation, and they may change in the future.

ECS Agreements and Rules

General agreements

  • for integer values, use the long type
  • use the keyword type for ID and code fields (for example, an error code)
  • for text, choose between a text field and a multi-field:
      • by default, Elasticsearch indexes a text field twice: once as text for full-text search and once as keyword for exact matching, sorting, and aggregations
      • ECS changes this approach: almost all text fields are of the keyword type, with a few exceptions. If full-text search is required, a multi-field may be added
      • the exceptions are fields that must be indexed for full-text search, for example, message
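
The conventions above can be illustrated with a small index mapping. The following is a minimal sketch written as a Python dictionary, not a complete ECS template: the selection of fields is an assumption made for the example, and it assumes the cluster supports the match_only_text type.

```python
# Illustrative index mapping that follows the conventions listed above.
# The choice of fields is an assumption made for this sketch.
ecs_style_mapping = {
    "properties": {
        "http": {
            "properties": {
                "response": {
                    "properties": {
                        # Integer values are stored as long.
                        "status_code": {"type": "long"},
                    }
                }
            }
        },
        "error": {
            "properties": {
                # IDs and codes are keyword, even when they look numeric.
                "code": {"type": "keyword"},
            }
        },
        "user_agent": {
            "properties": {
                # keyword by default, plus a multi-field for full-text search.
                "original": {
                    "type": "keyword",
                    "fields": {"text": {"type": "match_only_text"}},
                },
            }
        },
        # One of the exceptions that is indexed for full-text search.
        "message": {"type": "match_only_text"},
    }
}
```

With such a mapping, exact matching and aggregations would use user_agent.original, while full-text queries would target user_agent.original.text.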

Implementation patterns

  • Base fields - a group of individual top-level fields that do not belong to any field set
  • Host - fields describing the device on which the event occurred: a computer, virtual machine, container, or cloud server
  • Agent and observer - an agent is software that collects, observes, measures, or detects an event, for example, Elastic Beats. An observer is an external monitoring or intermediary device, for example, a firewall, APM server, or web proxy
  • Timestamp - each event must have a timestamp. Some events carry additional timestamps, which follow the chronological order @timestamp < event.created < event.ingested
  • Origin - fields that record where the event originated
  • Categorization - classification of events using ECS fields (see below for details)
  • Enriching events - adding extra information to events; ECS has many fields that can be added to an event in this way
  • Lookups
  • Parsing
  • Related fields - many events contain several fields with the same kind of content, for example, an IP address, file hash, or hostname. To search across such fields, use the related.* fields. For example, if an event has IP addresses in host.*, source.*, and destination.*, add all of them to related.ip to simplify later searches
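
The related fields pattern can be sketched in a few lines of Python. This is an illustrative helper only: the function name collect_related_ips and the nested-dictionary event layout are assumptions made for this example, not part of ECS or of any library.

```python
def collect_related_ips(event: dict) -> dict:
    """Copy every IP found in host.ip, source.ip and destination.ip into related.ip."""
    # Start from any addresses that are already present in related.ip.
    ips = list(event.get("related", {}).get("ip", []))
    for field_set in ("host", "source", "destination"):
        value = event.get(field_set, {}).get("ip")
        if value is None:
            continue
        # host.ip is often a list, while source.ip and destination.ip hold one value.
        candidates = value if isinstance(value, list) else [value]
        for ip in candidates:
            if ip not in ips:
                ips.append(ip)
    if ips:
        event.setdefault("related", {})["ip"] = ips
    return event


event = {
    "host": {"ip": ["10.0.0.5"]},
    "source": {"ip": "10.0.0.5"},
    "destination": {"ip": "192.0.2.10"},
}
related_ips = collect_related_ips(event)["related"]["ip"]
```

After this step, a single query on related.ip finds the event regardless of which field set the address came from.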

Processing network events

The following recommendations apply to handling network-related events. For more information, see the official documentation page.

  • Check for a source and a destination - if the event contains information about the sending and the receiving party, fill in the source.* and destination.* fields first. Some events also indicate the role of each host (client or server). In that case, fill in the role fields in addition to the source and destination fields, i.e. copy the values from source.*/destination.* to client.*/server.*. This allows later filtering both by role and by source/destination.
  • Fill in the related fields: add all IP addresses to related.ip as an array. Host names may also occur in network events; copy them to related.hosts
  • Fill in the categorization fields that best characterize the event
  • The combination of the categorization fields event.category: network and event.type: protocol also requires the network.protocol field to be filled in
  • If desired, the original fields can be kept as custom fields, for example, Nginx.* for NGINX
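
As an illustration of the first and fourth recommendations, the sketch below mirrors source/destination into client/server and checks for network.protocol. The function name apply_network_conventions and the source_is_client flag are hypothetical; whether the source actually acts as the client must be decided from the event itself.

```python
import copy


def apply_network_conventions(event: dict, source_is_client: bool = True) -> dict:
    """Mirror source/destination into client/server and check network.protocol."""
    source = event.get("source")
    destination = event.get("destination")
    if source is not None and destination is not None:
        if source_is_client:
            client, server = source, destination
        else:
            client, server = destination, source
        # Deep copies, so later changes to one field set do not leak into the other.
        event.setdefault("client", copy.deepcopy(client))
        event.setdefault("server", copy.deepcopy(server))

    categorization = event.get("event", {})
    needs_protocol = (
        "network" in categorization.get("category", [])
        and "protocol" in categorization.get("type", [])
    )
    if needs_protocol and not event.get("network", {}).get("protocol"):
        raise ValueError(
            "network.protocol is required when event.category contains "
            "'network' and event.type contains 'protocol'"
        )
    return event
```

Raising an error is only one possible reaction; a pipeline could instead tag the event or drop the protocol type.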

ECS Design Principles

  • Common schema - the main purpose of ECS is to maximize compatibility and reuse
  • Field sets as namespaces - try to break down complex concepts into simpler ones, for example, dns.question.class, dns.question.name, dns.question.type
  • Naming consistency - do not restrict terms with a broad meaning to a single use case
  • Reuse
  • Custom fields - in many situations, you will need custom fields to describe an event fully, but add them only if ECS has no corresponding field

Recommendations for field names

If ECS has no corresponding field, you can define your own field, but its name must follow these rules:

  • the field name must be in lowercase (except for the custom namespace described in the last rule)
  • use underscores to combine words (for example, requests_per_sec)
  • do not use any special characters other than underscores
  • use the present tense unless the field describes something that has already happened
  • use singular and plural correctly to reflect the contents of the field (for example, requests_per_sec rather than request_per_sec)
  • use prefixes for all fields, i.e. all fields related to the host must be placed in host.*
  • nest fields as JSON objects, using a dot to separate levels (read more on the official documentation page)
  • organize fields from general to specific
  • do not repeat words (for example, host.host_ip is incorrect; use host.ip)
  • avoid abbreviations except for common ones (for example, ip, geo, etc.)
  • for a custom field with no match among the predefined fields, it is recommended to write the name of the service with a capital letter and place all its fields under that name, for example, Nginx.source.ip, Jitsi.ip. We chose this method to separate predefined ECS fields from custom ones. If you ignore this rule, a future ECS release may reserve a field whose name matches the name of your custom field
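
Several of these rules can be checked mechanically. The following Python sketch covers only the rules about case, special characters, the capitalized custom namespace, and repeated words; the function name check_custom_field_name and the exact regular expressions are assumptions made for this example.

```python
import re

# One lowercase segment: letters, digits and underscores, starting with a letter.
_SEGMENT = re.compile(r"^[a-z][a-z0-9_]*$")
# Custom namespace: the same, but with a capital first letter, e.g. "Nginx".
_CUSTOM_NAMESPACE = re.compile(r"^[A-Z][a-z0-9_]*$")


def check_custom_field_name(name: str) -> list:
    """Return a list of rule violations for a custom field name (empty if none)."""
    problems = []
    segments = name.split(".")
    if len(segments) < 2:
        return [f"{name!r}: place the field under a service namespace, e.g. Nginx.{name}"]

    namespace, rest = segments[0], segments[1:]
    if not _CUSTOM_NAMESPACE.match(namespace):
        problems.append(f"{namespace!r}: the custom namespace should start with a capital letter")
    for segment in rest:
        if not _SEGMENT.match(segment):
            problems.append(f"{segment!r}: use lowercase letters, digits and underscores only")
    # Repeated words, such as host.host_ip instead of host.ip.
    for parent, child in zip(segments, segments[1:]):
        if child.lower().startswith(parent.lower() + "_"):
            problems.append(f"{child!r}: do not repeat the parent name {parent!r}")
    return problems
```

For example, check_custom_field_name("Nginx.source.ip") returns an empty list, while "Nginx.Source-IP" and "Jitsi.jitsi_ip" each produce one violation.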

ECS Classification fields

ECS provides four levels of categorization that describe an event in general terms, regardless of its source. First, consider the basic principles of classification:

  • similar events that can be viewed and analyzed together should share the same event.category value
  • event.category and event.type are arrays, so try to assign the event to all relevant categories
  • only the predefined values described below should be used
  • event.outcome is deliberately limited to describing whether an event succeeded. Domain-specific actions that could be seen as results, such as deny or allow, must be recorded in event.type and/or event.action. For example, when a firewall blocks access, the event is successful from the firewall's point of view (event.outcome: success), while the action taken is recorded separately (event.action: dropped)
  • the allowed values of event.category, event.type, and event.outcome are the same for all values of event.kind
  • if a specific event does not match any of the predefined classification values, the field should be left empty

Let's look at the four levels of classification, each represented by a field, in more detail.
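
The firewall case from the principles above can be written out as an event document. The sketch below shows only the categorization fields as a Python dictionary; the choice of event.kind, event.category, and event.type values is an assumption made for the example, picked from the predefined values described in the following sections.

```python
# Categorization fields for a connection blocked by a firewall.
# From the firewall's point of view the event succeeded, so event.outcome
# is "success"; what actually happened is carried by event.type/event.action.
blocked_connection = {
    "event": {
        "kind": "event",
        "category": ["network"],   # array: one event may have several categories
        "type": ["denied"],        # array: one event may have several types
        "action": "dropped",       # domain-specific action taken by the firewall
        "outcome": "success",
    }
}
```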

event.kind - the highest level of event categorization

This field describes what kind of information the event contains, without specifying the content of the event itself. Its value can be used to decide how events of this kind should be handled. It must contain exactly one of the following values:

  • alert - an alert or notable event raised by a system in response to some activity, for example, firewall events, discovery of a new host, user login, etc.
  • enrichment - events collected to provide additional context, often for events that already exist
  • event - used for events indicating that something has happened
  • metric - describes a numerical measurement taken at a given time, for example, CPU usage, memory usage, etc. Metrics are often collected at a fixed frequency, for example, once a minute.
  • state - similar to a metric event: it is also collected at a certain interval, but contains one of a fixed set of values, for example, closed or opened. Note that a state change itself should be recorded with event.kind: event, since a state change fits the more general definition of an event
  • pipeline_error - indicates that an error occurred while ingesting the event, so the event data may be missing or incorrect. It is often associated with parsing errors
  • signal - a reserved value used by Elastic solutions, for example, security and observability, for alert events created by rules in the Kibana alerting framework.

event.category - the second level of event categorization

Filtering on this field selects all events of a given category using consistent values. The field is an array, so an event can belong to several categories, and its values can be used when configuring parsing for various sources. The values are closely tied to the third level of categorization. Each value must be one of the following:

  • authentication - events related to the request and response in which credentials are provided and verified, for example, Windows log or ssh log events. Expected category types for the third level of categorization: start, end, info
  • configuration - configuration-related events - creation, modification, deletion of process, application, and system parameters. For example, logs of security policy changes, configuration audits, and system integrity monitoring. Expected category types for the third level of categorization: access, change, creation, deletion, info
  • database - events related to data storage and retrieval systems, not only relational ones, for example, MSSQL or PostgreSQL logs. Expected category types for the third level of categorization: access, change, info, error
  • driver - events related to OS device drivers and similar software products, for example, Windows drivers, kernel extensions, kernel modules, etc. Expected category types for the third level of categorization: change, end, info, start
  • email - events related to email: messages, attachments and events, network activity, protocols used. For example, messages from the mail firewall, messages from the cloud mail service, etc. Expected category types for the third level of categorization: info
  • file - events about the creation of or access to files in a file system. They are used to visualize and analyze the creation, access, and deletion of files, including files seen on the network, such as those reported in the Zeek files.log. Expected category types for the third level of categorization: change, creation, deletion, info
  • host - events for visualization or analysis of host inventory information or host lifecycle events. Most of these events are observed from outside the host, while some, such as start and end, can also be seen on the host itself. Please note that events in this category describe the hosts themselves and are not intended to track activity occurring on a host. Expected category types for the third level of categorization: access, change, end, info, start
  • iam - Identity and Access Management (IAM) events related to users, groups, and administration, for example, for visualization and analysis of data from Active Directory, LDAP, Okta, Duo, and other IAM systems. Expected category types for the third level of categorization: admin, change, creation, deletion, group, info, user
  • intrusion_detection - intrusion detection events from IDS/IPS systems and functions, for example, visualization and analysis of intrusion detection warnings from Snort, Suricata, Palo Alto, etc. Expected category types for the third level of categorization: allowed, denied, info
  • malware - malware detection events, for example, Kaspersky anti-virus, Dr. Web, Elastic Endpoint Security, etc. Expected category types for the third level of categorization: info
  • network - events related to any network activity, including the lifecycle of a network connection, network traffic, and almost any event involving an IP address. They are used, for example, to visualize and analyze counts of network ports, protocols, addresses, and geolocation data. Expected category types for the third level of categorization: access, allowed, connection, denied, end, info, protocol, start
  • package - events related to software packages installed on hosts. They can also be used to assess the vulnerability of a host when vulnerability scanning data is not available. Expected category types for the third level of categorization: access, change, deletion, info, installation, start
  • process - events related to the visualization and analysis of information about the process, for example, life cycle events and the origin of the process. Expected category types for the third level of categorization: access, change, end, info, start
  • registry - events related to settings and assets stored in the Windows registry, for example, access to the registry and its changes. Expected category types for the third level of categorization: access, change, creation, deletion
  • session - events and metrics related to logical persistent connections to hosts and services, for example, data from Windows events, ssh logs, or stateless sessions such as those based on HTTP cookies. Expected category types for the third level of categorization: start, end, info
  • threat - events for visualization and analysis of the goals, motives, or behavior of threat actors. Expected category types for the third level of categorization: indicator
  • vulnerability - vulnerability scan result events, for example, from Tenable, Qualys, and other vulnerability scanners. Expected category types for the third level of categorization: info
  • web - events about access to a web server, used to analyze the activity of web servers and proxies such as IIS, nginx, Apache, etc. Expected category types for the third level of categorization: access, error, info
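
The expected types listed for each category can be turned into a lookup table and used to flag unusual combinations. The sketch below transcribes the lists above into Python; the names EXPECTED_TYPES and unexpected_types are assumptions made for this example.

```python
# Expected event.type values per event.category, transcribed from the list above.
EXPECTED_TYPES = {
    "authentication": {"start", "end", "info"},
    "configuration": {"access", "change", "creation", "deletion", "info"},
    "database": {"access", "change", "info", "error"},
    "driver": {"change", "end", "info", "start"},
    "email": {"info"},
    "file": {"change", "creation", "deletion", "info"},
    "host": {"access", "change", "end", "info", "start"},
    "iam": {"admin", "change", "creation", "deletion", "group", "info", "user"},
    "intrusion_detection": {"allowed", "denied", "info"},
    "malware": {"info"},
    "network": {"access", "allowed", "connection", "denied", "end", "info",
                "protocol", "start"},
    "package": {"access", "change", "deletion", "info", "installation", "start"},
    "process": {"access", "change", "end", "info", "start"},
    "registry": {"access", "change", "creation", "deletion"},
    "session": {"start", "end", "info"},
    "threat": {"indicator"},
    "vulnerability": {"info"},
    "web": {"access", "error", "info"},
}


def unexpected_types(categories: list, types: list) -> list:
    """Return the event.type values that are not expected for any given category."""
    unknown = [c for c in categories if c not in EXPECTED_TYPES]
    if unknown:
        raise ValueError(f"unknown event.category values: {unknown}")
    allowed = set()
    for category in categories:
        allowed |= EXPECTED_TYPES[category]
    return [t for t in types if t not in allowed]
```

Because both fields are arrays, a type is accepted here if at least one of the event's categories expects it. That is a design choice of this sketch, not a rule stated by ECS.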

event.type - the third level of event categorization

This field is an array of values, like event.category. Combined with event.category, it lets you classify events that may belong to several types, and categorize and filter events for a single visualization. Each value must be one of the following:

  • access - events indicating that something was accessed. For additional separation of events, you can use the event.action field
  • admin - administrative events, used for events related to administrative objects, for example, changes in the IAM structure that do not affect a user or group. For additional separation of events, you can use the event.action field
  • allowed - events indicating that something was allowed. For additional separation of events, you can use the event.action field
  • change - events indicating that something has been changed. For additional separation of events, you can use the event.action field
  • connection - network traffic events involving flow or connection analysis. Events must include at least the source and destination IP addresses, TCP/UDP ports, and contain the number of bytes and packets transmitted. Note that next-generation firewall events also fall into this category. For additional separation of events, you can use the event.action field
  • creation - events indicating that something has been created
  • deletion - events indicating that something has been deleted
  • denied - events indicating that something has been rejected, for example, firewall events. For additional separation of events, you can use the event.action field
  • end - events indicating that something has ended
  • error - events describing the error. Please note that for errors in the pipeline itself, it is worth using event.kind: pipeline_error
  • group - events related to group objects. For additional separation of events, you can use the event.action field
  • indicator - events that contain indicators of compromise (IOCs)
  • info - information events that do not carry information about state changes or actions
  • installation - events indicating that something has been installed
  • protocol - events that contain protocol details or protocol analysis. Note that events containing only the protocol name or its identifier should not use this type.
  • start - events indicating that something has started
  • user - events related to user objects. For additional separation of events, you can use the event.action field

event.outcome - the lowest level of event categorization

This field can be used as a flag describing success or failure from the point of view of the object that produced the event. Note that when a chain of events describes a single transaction, each event may have a different value of this field, depending on the object that produced it. For a compound event (a single event that contains several logical events), use the overall final result. This field is not filled in for metric events, informational events, or events for which a result does not make logical sense. It must take one of the following values:

  • failure
  • success
  • unknown - indicates an event, the result of which is not known from the point of view of the event producer. For example, if the event contains only information about the responding party to the transaction request. You should not use this value when the result does not make logical sense - in such cases, this field should not be filled in.

ECS Field Reference

As mentioned above, fields are classified into core and extended. The Field Reference defines several groups, called field sets, for example, Base, Agent, Device, DNS, etc. This section looks at the extended fields. The full list of field sets is available in the official documentation.

Each event can contain several objects (field sets) plus the base fields at the root.

ECS defines groups of related fields for different kinds of data. Every field set is defined as an object in Elasticsearch that contains all of its fields, with the exception of the Base fields, which are located at the root of the event. As an example, consider https://www.elastic.co/guide/en/ecs/current/ecs-http.html. Only a couple of fields are discussed here; the rest are described at the link.

This example describes the HTTP request fields. Here, http is an object in Elasticsearch, and the rest of the name identifies a specific field within the ECS field set.

For example, in http.request.body.bytes, http is the Elasticsearch object, and request.body.bytes is the name of the field that holds the data.
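
The relationship between a dotted field name and the nested objects stored in the document can be shown with a short Python sketch. The helper name nest_fields and the sample values are assumptions made for this example.

```python
def nest_fields(flat: dict) -> dict:
    """Expand dotted field names into nested objects."""
    nested = {}
    for dotted_name, value in flat.items():
        *parents, leaf = dotted_name.split(".")
        node = nested
        for key in parents:
            # Create the intermediate object on first use, then descend into it.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


document = nest_fields({
    "http.request.body.bytes": 887,
    "http.request.method": "POST",
})
```

Here document is {"http": {"request": {"body": {"bytes": 887}, "method": "POST"}}}: a single http object that contains the fields of the HTTP field set.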

Migration to ECS

There are two approaches:

  • Use tools that already support ECS, for example, Elastic Beats
  • Map the fields manually

Consider the manual method:

  1. Go through each field of the original event and match it with the corresponding ECS field
  2. Review each core field and try to fill it in
  3. Look at the extended fields from other field sets that you already use and try to fill them in
  4. Set the ecs.version field to the ECS version you are using
  5. Use a spreadsheet to plan the migration from the existing source fields to ECS

More information about migration can be found on the official documentation page.
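
Steps 1 and 4 of the manual method can be sketched as a simple rename table. The example below is a hypothetical mapping for a few nginx access log variables; the names FIELD_MAP and to_ecs are assumptions made for illustration, and unmatched fields are placed under a capitalized custom namespace as recommended earlier in this guide.

```python
ECS_VERSION = "8.6.0"

# Hypothetical mapping from source field names to ECS field names.
FIELD_MAP = {
    "remote_addr": "source.ip",
    "request_method": "http.request.method",
    "status": "http.response.status_code",
    "body_bytes_sent": "http.response.body.bytes",
}


def to_ecs(raw: dict, custom_namespace: str = "Nginx") -> dict:
    """Rename known source fields to ECS names and keep the rest as custom fields."""
    ecs_event = {"ecs.version": ECS_VERSION}
    for name, value in raw.items():
        if name in FIELD_MAP:
            ecs_event[FIELD_MAP[name]] = value
        else:
            # No ECS counterpart found: keep the field under the custom namespace.
            ecs_event[f"{custom_namespace}.{name}"] = value
    return ecs_event


converted = to_ecs({
    "remote_addr": "203.0.113.7",
    "status": 200,
    "upstream_response_time": 0.012,
})
```

The result uses flat dotted keys; whether they are sent as dotted names or as nested objects depends on the ingestion tool.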

Minimum set of fields for each event

Any event must include the following fields:

Field name | Example value | Field type | Possible values | Explanation
@timestamp | 2023-05-23T08:05:34.853Z | date | - | Time of event receipt
message | Message error | match_only_text | - | The original text of the event
ecs.version | 8.6.0 | keyword | - | It is required to specify which version of ECS to adhere to when processing an event; fields or their names may differ depending on the version
event.module | nginx | keyword | - | Must include the name of the module or service that generates the event
event.dataset | nginx.access | keyword | - | The name of the dataset. If the source generates more than one type of logs or events, then the type must be separated by a dot. For example, nginx generates at least two log files - access and errors, then for each log file this field will contain either nginx.access or nginx.error
event.kind | event | keyword | alert, enrichment, event, metric, state, pipeline_error, signal | Specifies the type of event information without specifying the content. In fact, it answers the question "how should the event be handled?". For example, you can use this field to distinguish a notification event from metrics.
event.category | ["network", "web"] | keyword | authentication, configuration, database, driver, email, file, host, iam, intrusion_detection, malware, network, package, process, registry, session, threat, vulnerability, web | Array; the event category
event.type | ["access"] | keyword | access, admin, allowed, change, connection, creation, deletion, denied, end, error, group, indicator, info, installation, protocol, start, user | Array; a subcategory of event.category, with closely related values
event.outcome | success | keyword | failure, success, unknown | Event result; in some cases the field is not filled in
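
The table can be used as a checklist in code. The following Python sketch assumes events are represented as flat dictionaries with dotted keys; the names REQUIRED_FIELDS, missing_fields, and invalid_values are assumptions made for this example. event.outcome is treated as optional because the table notes that it is not always filled in.

```python
REQUIRED_FIELDS = (
    "@timestamp", "message", "ecs.version", "event.module",
    "event.dataset", "event.kind", "event.category", "event.type",
)
EVENT_KINDS = {"alert", "enrichment", "event", "metric", "state",
               "pipeline_error", "signal"}
EVENT_OUTCOMES = {"failure", "success", "unknown"}


def missing_fields(event: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if event.get(name) in (None, "", [])]


def invalid_values(event: dict) -> list:
    """Return the names of fields whose value is outside the allowed set."""
    problems = []
    if "event.kind" in event and event["event.kind"] not in EVENT_KINDS:
        problems.append("event.kind")
    # event.outcome is optional, but when present it must be an allowed value.
    if "event.outcome" in event and event["event.outcome"] not in EVENT_OUTCOMES:
        problems.append("event.outcome")
    return problems


sample = {
    "@timestamp": "2023-05-23T08:05:34.853Z",
    "message": "Message error",
    "ecs.version": "8.6.0",
    "event.module": "nginx",
    "event.dataset": "nginx.access",
    "event.kind": "event",
    "event.category": ["network", "web"],
    "event.type": ["access"],
    "event.outcome": "success",
}
```

For the sample event, built from the example values in the table, both functions return an empty list.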