Graylog: Data mapping issues

How to resolve Graylog index data mapping problems

A frequent issue in Graylog involves data mapping conflicts, which can result in failed indexing attempts. You may encounter this problem if you see logs similar to the following:

ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [level] of type [long] in document with id '34fd4d11-36ed-11f0-afc9-0242ac140002'. Preview of field's value: 'error']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=For input string: "error"]];
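When many of these errors show up, it can help to extract the affected field, the type it is locked to, and the rejected value. The following is a minimal Python sketch that assumes the message has the same shape as the example above; the function name `parse_mapping_error` and the regular expression are illustrative, not part of Graylog.

```python
import re

# Matches the "failed to parse field" message shown above.
# Assumption: other Graylog/OpenSearch versions may word the message differently.
MAPPING_ERROR = re.compile(
    r"failed to parse field \[(?P<field>[^\]]+)\] of type \[(?P<type>[^\]]+)\]"
    r".*?Preview of field's value: '(?P<value>[^']*)'"
)


def parse_mapping_error(line):
    """Return (field, expected_type, rejected_value), or None if the line does not match."""
    match = MAPPING_ERROR.search(line)
    if match is None:
        return None
    return match.group("field"), match.group("type"), match.group("value")


sample = (
    "ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, "
    "reason=failed to parse field [level] of type [long] in document with id "
    "'34fd4d11-36ed-11f0-afc9-0242ac140002'. Preview of field's value: 'error']]"
)
print(parse_mapping_error(sample))  # ('level', 'long', 'error')
```

Here the result shows that the "level" field is mapped as "long" and that the value "error" was rejected.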

This issue is caused by OpenSearch's dynamic mapping feature. Dynamic mapping automatically determines the data type of each field from the first value written for that field in an index. Once this data type is set, it is "locked in" for that index, and any subsequent document whose value for that field cannot be converted to the locked-in type is rejected with a mapper parsing exception.

When a new index is created, the first document that contains a given field defines the mapping for that field. For example, if the document contains a "level" field with a value of 3 (a numeric value), OpenSearch sets the data type for "level" to "long" (a numeric type). If a later document sent to Graylog contains the "level" field with the value "error" (a string), it is rejected because the value cannot be parsed as the type initially set. This triggers a mapper_parsing_exception error with the reason "failed to parse field [level] of type [long] in document with id 'xxx'".

This issue can occur with any field if data types are inconsistent across documents.
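The lock-in behavior described above can be illustrated with a toy model. This is a simplified Python sketch, not OpenSearch code: `ToyIndex` and `MappingConflict` are hypothetical names, and only two type labels ("long" and "string") are modeled.

```python
class MappingConflict(Exception):
    """Raised when a value does not fit the type already locked in for a field."""


class ToyIndex:
    """Toy model of dynamic mapping. Illustration only, not OpenSearch code."""

    def __init__(self):
        self.mapping = {}  # field name -> "long" or "string"

    def index(self, document):
        for field, value in document.items():
            locked = self.mapping.get(field)
            if locked is None:
                # First value seen for this field: infer its type and lock it in.
                is_integer = isinstance(value, int) and not isinstance(value, bool)
                self.mapping[field] = "long" if is_integer else "string"
            elif locked == "long":
                # Numeric strings such as "4" can still be converted; "error" cannot.
                try:
                    int(value)
                except (TypeError, ValueError):
                    raise MappingConflict(
                        f"failed to parse field [{field}] of type [long], value: {value!r}"
                    )
            # A field locked as "string" accepts any value in this toy model.


index = ToyIndex()
index.index({"message": "disk check ok", "level": 3})
print(index.mapping)  # {'message': 'string', 'level': 'long'}

try:
    index.index({"message": "disk failure", "level": "error"})
except MappingConflict as exc:
    print("rejected:", exc)
```

The second document is rejected only because "level" was first seen as a number. Had the string arrived first, the field would have been locked as a string instead.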

To resolve this issue, you have two main options: standardize the data types at the source, or convert them in Graylog using pipelines.

The ideal solution is to standardize the data types used for fields across all systems sending data to Graylog. For example, make sure the "level" field is always sent as either a string (such as "error", "warn", etc.) or always as a number (3, 4, etc.). This consistency prevents mapping conflicts and ensures all documents are ingested correctly.
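As an illustration of standardizing at the source, here is a minimal Python sketch that an application could call before sending a log entry. It assumes you choose to always send "level" as a lowercase string and that numeric levels follow the syslog severity scale; the function name `normalize_level` and the alias table are hypothetical choices, not something Graylog requires.

```python
# Syslog severity numbers and their names.
SYSLOG_LEVELS = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notice", 6: "informational", 7: "debug",
}

# Assumption: common short forms that should map to a single spelling.
ALIASES = {"warn": "warning", "err": "error", "crit": "critical", "info": "informational"}


def normalize_level(value):
    """Return the level as a lowercase string, whatever form it arrived in."""
    if isinstance(value, bool):
        raise TypeError("a boolean is not a valid level")
    if isinstance(value, int):
        if value not in SYSLOG_LEVELS:
            raise ValueError(f"unknown numeric level: {value}")
        return SYSLOG_LEVELS[value]
    text = str(value).strip().lower()
    if text.isdecimal():
        # Numeric strings such as "4" are treated like the number 4.
        return normalize_level(int(text))
    return ALIASES.get(text, text)


print(normalize_level(3))       # error
print(normalize_level("WARN"))  # warning
```

Applying the same function in every system that sends logs keeps the "level" field to a single data type.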

If standardizing data types across all systems is not possible, you can use Graylog's pipelines to convert data types upon receipt. Pipelines allow you to define rules that transform data based on specific conditions.

To implement this solution:

  • Go to "System" > "Pipelines" in the Graylog web interface.
  • Click "Add new pipeline" to create a new pipeline.
  • Define rules to convert the "level" field (or other fields) to the desired data type. For example, you can convert numeric levels to their corresponding string representations (such as 3 to "error", 4 to "warning", etc.).

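This approach ensures that all incoming data matches the expected data types, preventing mapping conflicts. Keep in mind that the type of a field in an existing index cannot be changed: if you convert values to a type other than the one already mapped, the conversion only takes full effect once Graylog starts writing to a new index (for example, after the next index rotation).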
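To illustrate what such rules can look like, the following Python sketch generates the text of one pipeline rule per numeric level. The rule text relies on the `has_field`, `to_string` and `set_field` functions of Graylog's pipeline rule language; the rule titles, the level table and the `build_rules` helper are assumptions made for this example, so check the generated text in Graylog's rule editor for your version before relying on it.

```python
# Assumption: numeric levels follow the syslog scale; extend the table as needed.
LEVEL_NAMES = {3: "error", 4: "warning"}

# Text of one Graylog pipeline rule. "{number}" and "{name}" are filled in by Python.
RULE_TEMPLATE = '''rule "level {number} to {name}"
when
  has_field("level") && to_string($message.level) == "{number}"
then
  set_field("level", "{name}");
end
'''


def build_rules(level_names):
    """Return one rule source text per numeric level, ordered by level number."""
    return [
        RULE_TEMPLATE.format(number=number, name=name)
        for number, name in sorted(level_names.items())
    ]


for rule_source in build_rules(LEVEL_NAMES):
    print(rule_source)
```

Each printed rule can be pasted into a new rule in Graylog and then added to a stage of the pipeline. A pipeline typically also has to be connected to the stream that receives the messages before its rules are applied.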

For advanced users, Graylog allows you to view and manually adjust index mappings:

  • Go to "System" > "Indices" in the Graylog web interface.
  • Select the relevant index.
  • Navigate to "Configuration" > "Configure index field types" to view or modify the field mappings.

However, manual changes should be made with caution, as incorrect mappings can lead to further ingestion issues.