aboutsummaryrefslogtreecommitdiffstats
path: root/doc/check-health.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/check-health.md')
-rw-r--r--doc/check-health.md96
1 files changed, 96 insertions, 0 deletions
diff --git a/doc/check-health.md b/doc/check-health.md
new file mode 100644
index 0000000..f94a0bf
--- /dev/null
+++ b/doc/check-health.md
@@ -0,0 +1,96 @@
+Notify about health state
+=========================
+
+[![GitHub stars](https://img.shields.io/github/stars/eworm-de/routeros-scripts?logo=GitHub&style=flat&color=red)](https://github.com/eworm-de/routeros-scripts/stargazers)
+[![GitHub forks](https://img.shields.io/github/forks/eworm-de/routeros-scripts?logo=GitHub&style=flat&color=green)](https://github.com/eworm-de/routeros-scripts/network)
+[![GitHub watchers](https://img.shields.io/github/watchers/eworm-de/routeros-scripts?logo=GitHub&style=flat&color=blue)](https://github.com/eworm-de/routeros-scripts/watchers)
+[![required RouterOS version](https://img.shields.io/badge/RouterOS-7.13-yellow?style=flat)](https://mikrotik.com/download/changelogs/)
+[![Telegram group @routeros_scripts](https://img.shields.io/badge/Telegram-%40routeros__scripts-%2326A5E4?logo=telegram&style=flat)](https://t.me/routeros_scripts)
+[![donate with PayPal](https://img.shields.io/badge/Like_it%3F-Donate!-orange?logo=githubsponsors&logoColor=orange&style=flat)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=A4ZXBD6YS2W8J)
+
+[⬅️ Go back to main README](../README.md)
+
+> ℹ️ **Info**: This script can not be used on its own but requires the base
+> installation. See [main README](../README.md) for details.
+
+Description
+-----------
+
+This script is run from scheduler periodically, sending notification on
+health related events:
+
+* high CPU utilization
+* high RAM utilization (low available RAM)
+* voltage jumps up or down more than configured threshold
+* voltage drops below hard lower limit
+* power supply failed or recovered
+* temperature is above or below threshold
+
+Note that bad initial state will not trigger an event.
+
+Monitoring CPU and RAM utilization (available processing and memory
+resources) works on all devices. Other than that only sensors available
+in hardware can be checked. See what your hardware supports:
+
+ /system/health/print;
+
+### Sample notifications
+
+#### CPU utilization
+
+![check-health notification cpu utilization high](check-health.d/notification-01-cpu-utilization-high.avif)
+![check-health notification cpu utilization ok](check-health.d/notification-02-cpu-utilization-ok.avif)
+
+#### RAM utilization (low available RAM)
+
+![check-health notification ram utilization high](check-health.d/notification-03-ram-utilization-high.avif)
+![check-health notification ram utilization ok](check-health.d/notification-04-ram-utilization-ok.avif)
+
+#### Voltage
+
+![check-health notification voltage](check-health.d/notification-05-voltage.avif)
+
+#### Temperature
+
+![check-health notification temperature high](check-health.d/notification-06-temperature-high.avif)
+![check-health notification temperature ok](check-health.d/notification-07-temperature-ok.avif)
+
+#### PSU state
+
+![check-health notification psu fail](check-health.d/notification-08-psu-fail.avif)
+![check-health notification psu ok](check-health.d/notification-09-psu-ok.avif)
+
+Requirements and installation
+-----------------------------
+
+Just install the script and create a scheduler:
+
+ $ScriptInstallUpdate check-health;
+ /system/scheduler/add interval=53s name=check-health on-event="/system/script/run check-health;" start-time=startup;
+
+> ℹ️ **Info**: Running lots of scripts simultaneously can tamper the
+> precision of cpu utilization, escpecially on devices with limited
+> resources. Thus an unusual interval is used here.
+
+Configuration
+-------------
+
+The configuration goes to `global-config-overlay`, these are the parameters:
+
+* `CheckHealthTemperature`: an array specifying temperature thresholds for sensors
+* `CheckHealthVoltageLow`: value (in volt*10) giving a hard lower limit
+* `CheckHealthVoltagePercent`: percentage value to trigger voltage jumps
+
+> ℹ️ **Info**: Copy relevant configuration from
+> [`global-config`](../global-config.rsc) (the one without `-overlay`) to
+> your local `global-config-overlay` and modify it to your specific needs.
+
+Also notification settings are required for
+[e-mail](mod/notification-email.md),
+[matrix](mod/notification-matrix.md),
+[ntfy](mod/notification-ntfy.md) and/or
+[telegram](mod/notification-telegram.md).
+
+---
+[⬅️ Go back to main README](../README.md)
+[⬆️ Go back to top](#top)