Graylog: Data mapping issues

How to resolve Graylog index data mapping problems

A frequent issue in Graylog is a data mapping conflict, which causes indexing to fail for the affected messages. You are facing this problem if you see log entries similar to the following:

ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [level] of type [long] in document with id '34fd4d11-36ed-11f0-afc9-0242ac140002'. Preview of field's value: 'error']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "error"]];

This problem stems from OpenSearch's dynamic mapping feature. Dynamic mapping automatically determines the data type of each field from the first document in an index that contains that field. Once this type is set, it is "locked in" for that index, and any subsequent document whose value for that field cannot be parsed as that type is rejected with a mapper parsing exception.

When a new index is created, the first document containing a given field defines that field's mapping. For example, if this document contains a "level" field with the value 3 (a number), OpenSearch sets the data type for "level" to "long" (a numeric type). If a later document sent to Graylog contains the "level" field with the value "error" (a string), it is rejected because "error" cannot be parsed as a long. This triggers a mapper_parsing_exception error with the reason failed to parse field [level] of type [long] in document with id 'xxx'. Because the type is inferred again for each new index, the error can appear after an index rotation even if the previous index accepted the same messages.

This issue can occur with any field if data types are inconsistent across documents.
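
The lock-in behaviour described above can be illustrated with a small Python sketch. This is a toy model only, not the OpenSearch API: the ToyIndex class and MappingError exception are hypothetical names, and the model assumes just two field types ("long" for integers, "keyword" for everything else).

```python
class MappingError(Exception):
    """Raised when a value cannot be parsed as the field's locked-in type."""


class ToyIndex:
    """Toy stand-in for one index with dynamic mapping (not the real API)."""

    def __init__(self):
        self.mapping = {}  # field name -> "long" or "keyword"

    def index(self, doc):
        # Check every field first so that a rejected document changes nothing.
        new_types = {}
        for field, value in doc.items():
            field_type = self.mapping.get(field)
            if field_type is None:
                # First time this field is seen: infer its type and lock it in.
                is_int = isinstance(value, int) and not isinstance(value, bool)
                new_types[field] = "long" if is_int else "keyword"
            elif field_type == "long":
                try:
                    int(value)  # "4" parses as a number, "error" does not
                except (TypeError, ValueError):
                    raise MappingError(
                        f"failed to parse field [{field}] of type [long]. "
                        f"Preview of field's value: '{value}'"
                    )
            # Assumption: a "keyword" field accepts any value as its string form.
        self.mapping.update(new_types)


index = ToyIndex()
index.index({"level": 3})            # locks "level" to "long"
try:
    index.index({"level": "error"})  # cannot be parsed as a long
except MappingError as exc:
    print(exc)
```

In this model, as in the scenario above, the order of arrival matters: the conflict appears when the numeric value is indexed first and a non-numeric string follows.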

To resolve this issue, you have two options:

The ideal solution is to standardise the data types used for fields across all systems sending data to Graylog. For example, ensure that the "level" field is always sent as a string (such as "error" or "warn") or always as a number (such as 3 or 4), but never a mix of both. This consistency prevents mapping conflicts and ensures all documents are ingested correctly.
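
As a minimal sketch of standardising on the sender side, the following Python function always returns the "level" as a number before the message is sent. The function name, the alias table and the fallback value of 6 (info) are assumptions made for this illustration; the numbers follow the standard syslog severity scale.

```python
# Standard syslog severities (0 = most severe).
SYSLOG_LEVELS = {
    "emergency": 0, "alert": 1, "critical": 2, "error": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}

# Common short forms (an assumption; extend to match your applications).
ALIASES = {
    "emerg": "emergency", "crit": "critical", "err": "error",
    "warn": "warning", "informational": "info",
}


def to_numeric_level(value, default=6):
    """Return the level as an int, whatever form the application used."""
    if isinstance(value, bool):
        return default  # booleans are not meaningful levels
    if isinstance(value, int):
        return value
    text = str(value).strip().lower()
    if text.isdecimal():
        return int(text)  # numeric string such as "4"
    name = ALIASES.get(text, text)
    # Unknown names fall back to the default instead of raising.
    return SYSLOG_LEVELS.get(name, default)


# Every message leaves the application with a numeric "level".
message = {"short_message": "Disk almost full", "level": "warn"}
message["level"] = to_numeric_level(message["level"])
```

Calling the same function in every application that sends logs keeps the "level" field numeric in all documents, whichever one happens to be indexed first.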

If standardising data types across all systems is not possible, you can use Graylog's pipelines to convert data types upon receipt. Pipelines allow you to define rules that transform data according to specific conditions.

To implement this solution:

  • Go to "System" > "Pipelines" in the Graylog web interface.
  • Click "Add new pipeline" to create a new pipeline.
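  • Define rules to convert the "level" field (or other fields) to the desired data type. For example, you can convert numeric levels to their corresponding string representations (3 to "error", 4 to "warning", and so on).
  • Connect the pipeline to the streams that carry the affected messages, since a pipeline only processes messages from the streams it is connected to.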

This approach ensures that all incoming data conforms to the expected data types, thereby preventing mapping conflicts.
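
Before writing the rule itself, it can help to pin down the conversion it should perform. The Python sketch below is not Graylog rule syntax; it is a hypothetical reference function (level_as_string is a name chosen for this example) that expresses the numeric-to-string conversion described above, using the standard syslog severity names.

```python
# Standard syslog severity names, keyed by their numeric value.
LEVEL_NAMES = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notice", 6: "info", 7: "debug",
}


def level_as_string(value):
    """Return the level as a string, translating known numbers to names."""
    if isinstance(value, int) and not isinstance(value, bool):
        # Unknown numbers are kept, but as strings, so the type stays consistent.
        return LEVEL_NAMES.get(value, str(value))
    text = str(value).strip()
    if text.isdecimal():
        return LEVEL_NAMES.get(int(text), text)  # numeric string such as "3"
    return text.lower()  # already a name: only normalise the case


# Inputs a rule may receive, with the value it should store.
for raw in (3, 4, "3", "error", 42):
    print(repr(raw), "->", repr(level_as_string(raw)))
```

A pipeline rule that produces the same results for these inputs will always store "level" as a string, whichever form the sender used.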

For advanced users, Graylog allows you to view and manually adjust index mappings:

  • Go to "System" > "Indices" in the Graylog web interface.
  • Select the relevant index.
  • Navigate to "Configuration" > "Configure index field types" to view or modify field mappings.

However, any manual changes should be made with caution, as incorrect mappings can lead to further ingestion issues.