Prometheus: Configuring Prometheus alert rules

This document is part of the alerting guide. You can read the full guide here: How Prometheus alerting works and how to configure it.

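👋 Welcome to the Stackhero documentation!

Stackhero provides a ready-to-use Prometheus cloud solution with several benefits, including:

Alert Manager included, to send alerts to Slack, Mattermost, PagerDuty, and more.

A dedicated email server to send unlimited email alerts.

Blackbox to probe HTTP, ICMP, TCP, and more.

Easy configuration with an online configuration file editor.

Effortless updates with just a click.

Optimal performance and robust security, powered by a private and dedicated VM.

Save time and simplify your life: it only takes 5 minutes to try Stackhero's Prometheus cloud hosting solution!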

You can adjust the Prometheus alert rules by editing the rules-alert.yml file. To do so, go to your Stackhero dashboard, select your Prometheus service, then click "Prometheus alert rules configuration".

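We have already added a set of default alert rules to your Stackhero for Prometheus instance, so you usually do not need to modify the rules-alert.yml file unless you want to customize them.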
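
For orientation, here is a minimal sketch of how a standard Prometheus rule file is laid out, assuming rules-alert.yml follows the usual Prometheus format: alert rules are list items under the "rules" key of a named group. The group name and the InstanceDown rule are illustrative only and are not necessarily among the defaults shipped by Stackhero.

```yaml
# Sketch of the standard Prometheus rule file layout (assumed for rules-alert.yml)
groups:
  - name: "custom-alerts"          # hypothetical group name
    rules:
      # Each alert rule is one list item under "rules"
      - alert: "InstanceDown"      # illustrative rule, not taken from the defaults
        expr: up == 0              # "up" is 0 when Prometheus fails to scrape a target
        for: 5m                    # condition must hold for 5 minutes before firing
        labels:
          severity: "critical"
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

Rule snippets that start with "- alert:" would be added as further items in the "rules" list of such a group.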

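Here is an example that triggers an alert when disk usage exceeds 90% (less than 10% of space left) on a writable filesystem for at least 2 minutes: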

- alert: "HostOutOfDiskSpace"
  expr: (node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes < 10 and ON (instance, device, mountpoint) node_filesystem_readonly == 0
  for: 2m
  labels:
    severity: "warning"
  annotations:
    summary: "Host out of disk space (instance {{ $labels.instance }})"
    description: "Disk is almost full (< 10% left)"
    value: "{{ $value }}"
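
One way to check a rule like the one above before saving it is a unit test run with Prometheus's promtool ("promtool test rules <file>"). The sketch below assumes you have a local copy of rules-alert.yml that is a complete rule file containing the HostOutOfDiskSpace rule. The host name, device, mount point, and sample values are hypothetical.

```yaml
# Hypothetical test file, run with: promtool test rules <this file>
rule_files:
  - rules-alert.yml                # assumed local copy of the rule file

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # 5 of 100 units available (5% free), constant for 10 minutes
      - series: 'node_filesystem_avail_bytes{instance="host1", device="/dev/sda1", mountpoint="/"}'
        values: '5+0x10'
      - series: 'node_filesystem_size_bytes{instance="host1", device="/dev/sda1", mountpoint="/"}'
        values: '100+0x10'
      # 0 means the filesystem is writable
      - series: 'node_filesystem_readonly{instance="host1", device="/dev/sda1", mountpoint="/"}'
        values: '0+0x10'
    alert_rule_test:
      # After 5 minutes the 2-minute "for" delay has elapsed, so the alert should be firing
      - eval_time: 5m
        alertname: HostOutOfDiskSpace
        exp_alerts:
          - exp_labels:
              severity: warning
              instance: host1
              device: /dev/sda1
              mountpoint: /
            exp_annotations:
              summary: "Host out of disk space (instance host1)"
              description: "Disk is almost full (< 10% left)"
              value: "5"
```

The expected value annotation is 5 because the expression evaluates to the percentage of free space (5 * 100 / 100).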

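Here is another example. It triggers when less than 10% of space is left and a linear extrapolation of the last hour's usage predicts that the disk will be full within the next 24 hours: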

- alert: "HostDiskWillFillIn24Hours"
  expr: (node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes < 10 and ON (instance, device, mountpoint) predict_linear(node_filesystem_avail_bytes{fstype!~"tmpfs"}[1h], 24 * 3600) < 0 and ON (instance, device, mountpoint) node_filesystem_readonly == 0
  for: 2m
  labels:
    severity: "warning"
  annotations:
    summary: "Host disk will fill in 24 hours (instance {{ $labels.instance }})"
    description: "Filesystem is predicted to run out of space within the next 24 hours at the current write rate"
    value: "{{ $value }}"

You can find more alert rule examples on the Awesome Prometheus Alerts website.
