Deadman

Version: 0.7
Release date: Tuesday, 5 December, 2017

DEADMAN attempts to detect when a system is not operating properly and to reboot the system when this occurs.

Manual installation

The program is distributed as a ZIP package: download it to a temporary directory and unpack it to the destination folder. See below for the download link(s).

The following are the download links for manual installation:

Deadman v. 0.7 (11/7/2022, Steven Levine) Readme/What's new
deadman user guide v0.7

2017-12-05 SHL Version 0.1
2022-04-03 SHL Version 0.3
2022-06-21 SHL Version 0.5
2022-06-23 SHL Version 0.6
2022-06-23 SHL Version 0.7

== Introduction ==

Deadman attempts to detect one or more of a known set of problems that
might occur on a running system and to take appropriate recovery actions
when one of these problems is detected.

Deadman is an evolving application. New features are added when new
failure modes and/or new recovery modes are discovered.

Deadman was originally written to keep apache httpd servers that I
maintain up and running with minimal human interaction, so some of
deadman's features are specific to httpd servers. Other features are
more generic and may be useful with other applications.

Deadman logs its actions to the deadman.log log file. The log file will
be written to the %LOGFILES% directory if defined. Otherwise it will be
written to the %TEMP% directory. Deadman also logs its actions to
STDOUT, unless it is running detached.

Deadman writes its PID to deadman.pid in the %TEMP% directory. This
allows other processes to check and/or control deadman.

== Usage ==

Deadman is a VIO command line application which is typically run
detached. Output is written to the standard output, if deadman is not
running detached, and to the log file (%LOGFILES%\deadman.log).

The log file entries are timestamped so that they can be correlated with
information from other timestamped logs. Each log file entry includes a
message id of the form (#number). The id number can be used to locate
the code that generated the message, if needed.

To display the help screen, enter

  deadman.exe -?

at the command line. The help screen currently displays as:

  The deadman daemon checks system health based on the configuration
  file settings.
  See deadman.txt for a detailed description of operation and options.

  deadman [-c] [-h] [-s] [-t] [-v] [-V] [-?] [cfgfile]

    -c       Check daemon status
    -h -?    Display this message
    -s       Stop daemon
    -t       Run in TEST mode
    -v       Display verbose status
    -V       Display version
    cfgfile  Configuration file to process

  Copyright (c) 2008-2022 Steven Levine and Associates, Inc.
  All rights reserved.

== Theory of operation ==

Deadman attempts to monitor system health by watching the state of
user-selected files, as defined in the configuration file.

Deadman can:

 - monitor one or more transaction log files for activity
 - monitor one or more error log files for certain errors
 - reboot the system on request

When monitoring a transaction log file for activity, deadman expects the
file size to increase over time. If the file size fails to increase for
longer than the configured interval, deadman will attempt to reboot the
system after the reboot delay expires. The check interval and the reboot
delay interval are both configurable. Deadman contains logic to handle
log rotation, which will cause the log file size to be reduced.

When monitoring an error log file for errors, deadman will check the log
file for known errors at configurable intervals. The set of known errors
is currently:

 - httpd cannot create child process

As deadman evolves, additional checks may be implemented.

When one of the known errors is detected, deadman will perform
error-specific recovery actions. If the recovery actions fail, deadman
will attempt to reboot the system after the reboot delay expires. The
check interval and the reboot delay interval are both configurable.

When monitoring for reboot requests, deadman checks if the reboot
request file has been created. When the file is created, deadman will
attempt to reboot the system. If the reboot request file is not empty,
deadman will write the first line of the file to the deadman log file to
record the reason for the reboot.

== Sample configuration file ==

The Configuration File section describes the configuration file in more
detail.

  ; hostname: steven, domain: www.scoug.com
  ; checks error log for child process start failures
  ; checks transaction log for lack of activity
  ; 2022-04-03 SHL Baseline - steven

  translogfile = d:\logs\apache\scoug-combined_log
  processname = httpd
  TransLogCheckIntervalSec = 60 ; 1 minute
  errlogfile = d:\logs\apache\scoug-error_log
  ErrorLogCheckIntervalSec = 30
  rebootfile = d:\apps\apache24\reboot-me-now
  SleepSec = 10
  RebootDelaySec = 3600 ; 1 hour, 0 suppresses reboots
  ForceStatusSec = 21600 ; 6 hours

== Sample command lines ==

To start deadman in VIO mode:

  start "deadman" deadman d:\apps\bin\deadman.cfg

To start deadman detached:

  detach "deadman" deadman d:\apps\bin\deadman.cfg

To check if the deadman daemon is running:

  deadman -c

To stop the running instance of deadman:

  deadman -s

== The Configuration File ==

Deadman is controlled by the settings provided in the configuration
file. The configuration file contains one statement per line. Each
statement consists of a keyword and a value. The configuration file may
contain comments and blank lines.

All keywords are optional. If a keyword enables a feature, the feature
will not be enabled if the keyword is omitted. If the keyword sets a
time interval, a default interval will be set if the keyword is omitted.

The translogfile keyword names a transaction log file and enables the
transaction log file monitor feature. Deadman monitors this file for
growth. If the file stops growing for longer than the configured
interval, deadman will schedule a reboot. There is no default for the
transaction log file. If this keyword is omitted, transaction log
monitoring will not be enabled. To monitor multiple transaction log
files, specify each log file in a separate translogfile statement.

The translogcheckintervalsec keyword defines how often the transaction
log monitor feature will check the transaction log file. If this keyword
is omitted, the default check interval is 600 seconds (i.e. 10 minutes).

The processname keyword names the process that is responsible for
writing to the configured transaction log file. If a process name is
defined, deadman monitors the processes with this name. If there are no
instances with this process name running, deadman assumes that the user
has stopped the processes for maintenance and suspends transaction log
file monitoring until one or more instances of the process are
restarted. This prevents deadman from rebooting during planned shutdowns
of these processes. There is no default for the process name. If this
keyword is omitted, process monitoring will not be enabled.

The errlogfile keyword names an apache httpd error log file and enables
the httpd error log monitoring feature. Deadman will monitor the error
log file for httpd child create failures. There is no default for the
error log file. If this keyword is omitted, error log monitoring will
not be enabled. To monitor multiple error log files, specify each log
file in a separate errlogfile statement.

The errorlogcheckintervalsec keyword defines how often deadman will
check the configured error log file. If this keyword is omitted, the
default check interval is 600 seconds (i.e. 10 minutes).

The rebootfile keyword names the reboot request file and enables the
reboot request feature. If this file exists, deadman will reboot the
system. If this file exists when deadman is started, it will be deleted
to prevent a stale reboot request file from triggering a reboot. There
is no default for the reboot request file. If the keyword is omitted,
reboot request monitoring will not be enabled.

The sleepsec keyword defines how long deadman sleeps between check
cycles. If this keyword is omitted, the default interval is 30 seconds.

The rebootdelaysec keyword defines how long deadman waits after
scheduling a reboot before performing the reboot. This allows for
intermittent errors to be reported without forcing an unneeded reboot.
If this keyword is omitted, the default delay interval is 30 seconds.

The forcestatussec keyword defines how long deadman will wait before
writing a proof of life message to the deadman log file. If this keyword
is omitted, the default reporting interval is 21,600 seconds (i.e. 6
hours).

== Tuning deadman ==

Every system is different. The goal of tuning the deadman timing
parameters is to check often enough so that problems can be detected and
effectively handled, while at the same time minimizing false positives
and not checking so often as to waste system resources that could be
better used elsewhere.

When tuning deadman, it is recommended that deadman be run in test mode
(i.e. -t). Test mode suppresses reboots and reduces the forcestatussec
check interval, which makes the tuning process more efficient.

When tuning deadman, the deadman log file can be helpful. Look for
spurious reports that can be avoided by optimizing the timing
parameters.

Sleepsec defines the minimum reasonable value for all the other checking
intervals.

Translogcheckintervalsec should be set large enough to avoid most false
positives, but small enough so that any reboot attempt occurs before the
system has become so unstable that the reboot attempt will fail.

Errorlogcheckintervalsec should be set large enough to avoid wasting
system resources, but small enough so that the recovery attempt has a
high probability of success.

Rebootdelaysec should be set large enough to allow intermittent reboot
requests to clear, but small enough so that the reboot attempt occurs
before the system has become so unstable that the reboot request will
fail.

== Running multiple deadman instances ==

If needed, you can run multiple instances of deadman. To do this, make a
copy of deadman.exe giving it a unique name (e.g. deadman2.exe) and run
the copy with a unique configuration file.
The deadman log file name, the deadman pid file name and the default
configuration file name are determined by the deadman executable's name,
so there will be no conflict with other running deadman instances.

== Requirements ==

The dos.sys driver must be installed. This driver provides application
level access to the DosReboot DevHlp API.

== Known issues ==

None

== Ideas for the future ==

 - Enhance the error log monitor feature to detect more types of errors
   and provide recovery support.
 - Support units of measure for numeric values.
 - Support deadmanlogfile keyword.
 - Support deadmanpidfile keyword.

== Copyright and License ==

COVERED CODE IS PROVIDED UNDER THIS LICENSE ON AN "AS IS" BASIS, WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, WITHOUT
LIMITATION, WARRANTIES THAT THE COVERED CODE IS FREE OF DEFECTS,
MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE OR NON-INFRINGING. THE ENTIRE
RISK AS TO THE QUALITY AND PERFORMANCE OF THE COVERED CODE IS WITH YOU.
SHOULD ANY COVERED CODE PROVE DEFECTIVE IN ANY RESPECT, YOU (NOT THE
INITIAL DEVELOPER OR ANY OTHER CONTRIBUTOR) ASSUME THE COST OF ANY
NECESSARY SERVICING, REPAIR OR CORRECTION. THIS DISCLAIMER OF WARRANTY
CONSTITUTES AN ESSENTIAL PART OF THIS LICENSE. NO USE OF ANY COVERED
CODE IS AUTHORIZED HEREUNDER EXCEPT UNDER THIS DISCLAIMER.

Copyright (c) 2008-2022 Steven Levine and Associates, Inc.
All rights reserved.

Deadman is provided AS-IS, WITHOUT ANY WARRANTY OF ANY KIND, EITHER
EXPRESS, IMPLIED OR STATUTORY, not even any implied warranty of
MERCHANTABILITY.

YOUR USE OF THIS PRODUCT IS CONDITIONED UPON YOUR ACCEPTANCE OF THIS
LICENSE AGREEMENT. INSTALLING AND/OR USING THE PRODUCT INDICATES YOUR
ACCEPTANCE OF THESE TERMS AND CONDITIONS. IF YOU DO NOT AGREE TO THESE
TERMS AND CONDITIONS PROMPTLY DELETE THIS PRODUCT.

You are granted a non-exclusive, non-assignable, non-transferable right
to use deadman.exe.

== eof ==
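
As an illustration of the configuration file format described in The Configuration File section, the following Python sketch parses a deadman-style configuration file. It is not deadman's own parser, which the guide does not show. It assumes that keywords are case-insensitive (the sample file writes TransLogCheckIntervalSec while the text uses lowercase names), that ';' starts a comment anywhere on a line, and that the interval defaults are the values in seconds stated in the guide. The names DEFAULTS, LIST_KEYWORDS and parse_config are hypothetical.

```python
# Hypothetical sketch of a parser for the deadman configuration format.
# Assumptions (not confirmed by the guide): keywords are case-insensitive
# and ';' starts a comment anywhere on a line.

# Interval defaults in seconds, as stated in the guide.
DEFAULTS = {
    "translogcheckintervalsec": 600,
    "errorlogcheckintervalsec": 600,
    "sleepsec": 30,
    "rebootdelaysec": 30,
    "forcestatussec": 21600,
}

# Keywords that may be repeated, one log file per statement.
LIST_KEYWORDS = ("translogfile", "errlogfile")


def parse_config(text):
    """Return a dict of settings parsed from configuration file text."""
    cfg = dict(DEFAULTS)
    for name in LIST_KEYWORDS:
        cfg[name] = []
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comment and whitespace
        if not line:
            continue                          # blank or comment-only line
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected 'keyword = value': %r" % raw)
        key, value = key.strip().lower(), value.strip()
        if key in LIST_KEYWORDS:
            cfg[key].append(value)            # one entry per statement
        elif key in DEFAULTS:
            cfg[key] = int(value)             # interval in seconds
        else:
            cfg[key] = value                  # e.g. processname, rebootfile
    return cfg


sample = r"""
; checks transaction log for lack of activity
translogfile = d:\logs\apache\scoug-combined_log
processname = httpd
TransLogCheckIntervalSec = 60 ; 1 minute
RebootDelaySec = 3600 ; 1 hour, 0 suppresses reboots
"""
cfg = parse_config(sample)
print(cfg["translogfile"], cfg["translogcheckintervalsec"], cfg["sleepsec"])
```

In this sketch a feature whose keyword is omitted simply has no entry (or an empty list), which mirrors the rule that omitted feature keywords leave the feature disabled, while omitted interval keywords fall back to their defaults.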
 www.warpcave.com/betas/deadman-0.7-20220711.zip
Record updated last time on: 02/07/2023 - 19:21
