Graylog: Manage retention

How to configure log retention

Retention determines the number of messages stored in the OpenSearch database. You can configure retention based on the number of messages, a maximum age, or an overall size limit.

For example, you might choose to keep messages from the past 365 days, retain up to 200 million messages, or allocate a total of 400 GB of storage space.

Before setting your retention policy, it is important to understand how the indices used by Graylog and OpenSearch work. Think of indices as physical containers. Graylog "opens" a container (an index) and places incoming messages inside it. When the quota assigned to that container is reached, the container is closed and stored on a shelf, and a new container is opened for subsequent messages.

You can set this quota using different criteria:

  1. A number of messages: "Keep 20 million messages per container, then start a new one."
  2. A time limit: "Use a container for 10 days, then switch to a new one."
  3. A size limit: "Store 20 GB per container, then move on to the next one."

A maximum number of containers that can be stored on the shelf is also defined. If this number is exceeded, the oldest containers are automatically deleted. For example, if you set a maximum of 20 containers and have 22 on the shelf, the 2 oldest containers will be removed.

In this analogy, the containers are the indices, the shelf is OpenSearch, and the maximum number of containers is the maximum number of indices Graylog is allowed to keep.
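The container analogy can be sketched as a small simulation. This is only a toy model, not Graylog's actual implementation: the function name `simulate_rotation` is hypothetical, it uses the "number of messages" quota, and it assumes the currently open index counts toward the maximum number of indices.

```python
def simulate_rotation(total_messages: int, messages_per_index: int, max_indices: int) -> list:
    """Return the message count of every index still stored after ingestion.

    Toy model of the container analogy (hypothetical helper, not Graylog code).
    Assumption: the currently open index counts toward max_indices.
    """
    indices = [0]  # start with one open, empty index
    for _ in range(total_messages):
        # Rotate lazily: a full index is closed when the next message arrives.
        if indices[-1] >= messages_per_index:
            indices.append(0)
            # Too many containers on the shelf: delete the oldest ones.
            while len(indices) > max_indices:
                indices.pop(0)
        indices[-1] += 1
    return indices


if __name__ == "__main__":
    # 95 messages, 10 per index, at most 3 indices kept.
    remaining = simulate_rotation(95, 10, 3)
    print(remaining)       # [10, 10, 5]
    print(sum(remaining))  # 25 messages are still searchable
```

With these small numbers, ten indices are created in total, but only the three most recent ones survive, so 70 of the 95 messages have been deleted.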

Graylog offers three rotation strategies, which determine when the current index is closed and a new one is started:

  1. "Index time" defines the maximum duration for which messages are kept in each index, for example, 14 days per index.
  2. "Index message count" sets the maximum number of messages per index, for example, 20 million messages per index.
  3. "Index size" limits the maximum size of an index, for example, 40 GB per index.

You can choose one of these strategies based on your specific needs. For example, selecting "Index time" ensures that you always have logs from the past X days.

Be sure to accurately estimate your disk space requirements.

For example, if you store 1 GB of logs per day and decide to keep logs for the past 365 days, you will need 365 GB of disk space. Remember to also reserve extra space for operations (see below).
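As a rough illustration of this estimate, the sketch below multiplies the daily log volume by the retention period and adds a reserve. The function name `required_disk_gb` is hypothetical, and the default reserve of 15 GB is taken from the free-space recommendation given on this page. It assumes a constant daily volume, which real traffic rarely has.

```python
def required_disk_gb(daily_gb: float, retention_days: int, reserve_gb: float = 15) -> float:
    """Estimate the disk space needed, in GB (hypothetical helper).

    Assumes a constant daily log volume; reserve_gb is the free space
    kept aside for logs, Graylog's journal, and MongoDB data.
    """
    if daily_gb < 0 or retention_days < 0 or reserve_gb < 0:
        raise ValueError("all arguments must be non-negative")
    return daily_gb * retention_days + reserve_gb


if __name__ == "__main__":
    # 1 GB of logs per day, kept for 365 days, plus the 15 GB reserve.
    print(required_disk_gb(1, 365))  # 380
```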

By default, Graylog limits the number of indices to 20. You can adjust this value to fit your needs. For example, if you want to keep logs from the past 365 days, you could distribute retention across indices by dividing 365 days by 20 indices, which gives 18.25, rounded up to 19 days per index.

You can do similar calculations for the other strategies:

  1. For the "Index message count" strategy: if you want to keep 200 million messages with a maximum of 20 indices, then 200 million messages divided by 20 indices gives 10 million messages per index.
  2. For the "Index size" strategy: if you want to keep 400 GB of logs with a maximum of 10 indices, then 400 GB divided by 10 indices gives 40 GB per index.
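The three calculations above follow the same pattern: divide the total you want to keep by the maximum number of indices and round up. A minimal sketch, with the hypothetical helper name `quota_per_index`:

```python
def quota_per_index(total: int, max_indices: int) -> int:
    """Return the per-index quota needed to keep `total` across `max_indices`.

    Hypothetical helper. The result is rounded up so that the indices
    together always cover at least `total`.
    """
    if max_indices <= 0:
        raise ValueError("max_indices must be positive")
    return -(-total // max_indices)  # integer ceiling division


if __name__ == "__main__":
    print(quota_per_index(365, 20))          # 19 days per index
    print(quota_per_index(200_000_000, 20))  # 10000000 messages per index
    print(quota_per_index(400, 10))          # 40 GB per index
```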

We recommend always keeping at least 15 GB of free disk space for logs, Graylog's journal, and MongoDB data.

If available disk space runs out, OpenSearch will block write operations, and you will need to either reduce your retention or upgrade to a larger instance.
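The two thresholds mentioned on this page (15 GB of recommended free space, and the 7 GB limit below which OpenSearch switches to read-only mode) can be summarized in a small helper. This is only an illustration: the function name `disk_status` and the status labels are hypothetical, and the free-space value has to come from your own monitoring.

```python
RECOMMENDED_FREE_GB = 15  # recommended minimum free space
READ_ONLY_BELOW_GB = 7    # below this, OpenSearch sets indices to read-only


def disk_status(free_gb: float) -> str:
    """Classify the free disk space (hypothetical helper and labels)."""
    if free_gb < READ_ONLY_BELOW_GB:
        return "read-only"
    if free_gb < RECOMMENDED_FREE_GB:
        return "low"
    return "ok"


if __name__ == "__main__":
    for free in (20, 10, 5):
        print(free, "GB free:", disk_status(free))
```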

To configure the retention policy, go to the Graylog interface. Under the "System" menu, select "Indices" and click the "Edit" button in the "Default index set" section.

In the example below, the configuration sets a maximum of 27 indices, with each index retaining 14 days of logs. This setup allows you to keep logs for about a year (378 days).

We do not recommend keeping more than 14 days of messages per index.

Retention configuration to keep logs for a year

When you choose "Index time" as the rotation strategy, you must define the duration using the ISO 8601 duration format.

For example, "P7D" means 7 days, "P14D" means 14 days, and so on.
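To tie the duration format to the index count, the sketch below computes both values for the "Index time" strategy. The function name `index_time_plan` is hypothetical, and the default of 14 days per index follows the recommendation given on this page.

```python
def index_time_plan(target_days: int, days_per_index: int = 14) -> tuple:
    """Return (rotation_period, max_indices) covering at least target_days.

    Hypothetical helper. rotation_period is an ISO 8601 duration string
    such as "P14D"; max_indices is rounded up.
    """
    if target_days <= 0 or days_per_index <= 0:
        raise ValueError("both arguments must be positive")
    max_indices = -(-target_days // days_per_index)  # integer ceiling division
    return f"P{days_per_index}D", max_indices


if __name__ == "__main__":
    period, count = index_time_plan(365)
    print(period, count)  # P14D 27
    print(count * 14)     # 378 days actually covered
```

Rounding up means the configuration covers slightly more than the target: 27 indices of 14 days hold 378 days of logs.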

If you want to learn more about indices, we strongly encourage you to consult the official documentation.

Sometimes, OpenSearch may switch to read-only mode and you may encounter errors such as:

  1. "Flood stage disk watermark exceeded, all indices on this node will be marked read-only"
  2. "FORBIDDEN/12/index read-only / allow delete (api)"

These errors occur as part of OpenSearch's protection mechanism when disk space becomes critically low. When available disk space drops below 7 GB, OpenSearch sets indices to read-only as a precaution to prevent data corruption.

If you encounter these errors, you have two options:

  1. Reconfigure your retention policy to keep fewer logs. After adjusting the policy, delete the oldest index to free up disk space and allow OpenSearch to return to read-write mode. Please note that deleting an index will permanently erase all data it contains.
  2. Upgrade to an instance with a larger disk. With a single click from your Stackhero dashboard, the instance will restart with more disk space and OpenSearch will automatically return to read-write mode.