Fix false critical on OMD backup job when agent runs at the time the backup is about to start #711

Closed
wants to merge 1 commit into from

Conversation

dnlldl
Contributor

@dnlldl dnlldl commented Jun 20, 2024

Prevent this false critical alert:

Host
Service OMD backup
Event OK → CRITICAL
Time Mon Oct 23 01:30:05 EDT 2023
Summary Backup completed, it was running for 2 minutes 4 seconds from 2023-10-16 01:30:03 till 2023-10-16 01:32:06, Size: 426 MiB, Next run: 2023-10-23 01:30:00 CRIT
Details Backup completed, it was running for 2 minutes 4 seconds from 2023-10-16 01:30:03 till 2023-10-16 01:32:06, Size: 426 MiB, Next run: 2023-10-23 01:30:00 CRIT
Host Metrics rta=0.010ms;200.000;500.000;0; pl=0%;80;100;; rtmax=0.038ms;;;; rtmin=0.002ms;;;;
Service Metrics backup_duration=123.582501;;;; backup_avgspeed=865828.190744;;;; backup_size=446827456;;;;

This happens when the backup is about to start (here at 01:30:00) but has not yet started when the agent runs its check (also around 01:30:00 in this case, although the alert was generated at 01:30:05). According to the logs, the backup actually started at 01:30:03. It is normal for a cron job to start a few seconds late; add the gap between the check and the alert time reported by Checkmk, and the result is a false critical. The 30-second buffer will prevent this corner case from ever happening again. I'm aware that 30 seconds is arbitrary; an alternative would be to require 2 consecutive checks (or similar) before the state turns critical. Any suggestions are welcome.
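To illustrate the idea, here is a minimal sketch of the buffer logic, not the actual diff from this PR. The names `next_run_state`, `GRACE`, `OK`, and `CRIT` are hypothetical and chosen only for this example; the 30-second value and the timestamps come from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical state constants for this sketch (0 = OK, 2 = CRIT).
OK, CRIT = 0, 2

# Assumed buffer: the 30 seconds proposed in this PR description.
GRACE = timedelta(seconds=30)


def next_run_state(next_run: datetime, now: datetime,
                   grace: timedelta = GRACE) -> int:
    """Return CRIT only if the scheduled start is overdue by more than `grace`."""
    return CRIT if now > next_run + grace else OK


scheduled = datetime(2023, 10, 23, 1, 30, 0)

# Check runs 5 seconds after the scheduled start: still inside the buffer.
print(next_run_state(scheduled, datetime(2023, 10, 23, 1, 30, 5)))  # 0

# Check runs a full minute after the scheduled start: genuinely overdue.
print(next_run_state(scheduled, datetime(2023, 10, 23, 1, 31, 0)))  # 2
```

Without the buffer (a `grace` of zero), the first case would already be reported as critical, which matches the false alert shown above.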

I have read the CLA Document and I hereby sign the CLA or my organization already has a signed CLA.

@CheckmkCI CheckmkCI closed this in 457ac21 Nov 29, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Nov 29, 2024