pmon3 is a process manager (currently supporting Linux and macOS) that keeps processes alive indefinitely. Processes can be declared directly via the CLI or in a configuration file. pmon3 lets you get started quickly while providing configuration granularity at both the system and process levels.
- Introduction
- Installation
- CLI Commands
- System Configuration
- Process Configuration
- Process Dependencies
- Groups
- Event Handling
- Flap Detection/Prevention
- Debugging
- Performance
- Problems
Golang currently has no officially supported process management tools. For managing processes, some use built-in commands such as nohup [process] &, or the process management tools provided by the operating system such as systemd. Alternatively, third-party process management tools such as Supervisor (python) or PM2 (node.js) can also be utilized.
Unlike PM2, pmon3 is managed directly by the OS process manager, so if the pmon3 CLI abnormally terminates, it will not affect the pmond daemon process. This is currently achieved by separating the pmond daemon process from the pmon3 CLI.
By default, if a process abnormally terminates, pmond will try to restart the process. If you don't want a process to restart automatically, you can provide the --no-autorestart parameter flag from the CLI.
git clone https://github.com/joe-at-startupmedia/pmon3/ && cd pmon3
go mod tidy
go build -o bin/pmon3 cmd/pmon3/pmon3.go
go build -o bin/pmond cmd/pmond/pmond.go
cp -R bin/pmon* /usr/local/bin/
#create the configuration directory
mkdir -p /etc/pmon3/config/
cp config.yml /etc/pmon3/config/
#start the daemon
sudo /usr/local/bin/pmond &
The systemd installation process entails the following steps:
- create the configuration directory
- create the log rotation file
- create the bash completion profile (requires the bash-completion package)
- enable and start the pmond system service
#build the project
make build
#install on systemd-based system
make systemd_install
wget -O - https://raw.githubusercontent.com/joe-at-startupmedia/pmon3/master/release-installer.bash | bash
❗ After installing pmon3 for the first time, both installation methods provided above should automatically enable and start the service. If the pmond service does not start automatically, you need to start it manually.
sudo systemctl start pmond
# Others
sudo /usr/local/bin/pmond &
Usage:
  pmon3 [command]
Available Commands:
  completion  Generate completion script
  del         Delete process by id or name
  desc        Show process information by id or name
  dgraph      Show the process queue order
  drop        Delete all processes
  exec        Spawn a new process
  export      Export Process Configuration
  group       Group level commands
  help        Help about any command
  init        Initialize all stopped processes
  kill        Terminate all processes
  log         Display process logs by id or name
  logf        Tail process logs by id or name
  ls          List all processes
  reset       Reset the restart counter(s)
  restart     (Re)start a process by id or name
  stop        Stop a process by id or name
  topn        Shows processes with unix top cmd
  version
Flags:
  -h, --help   help for pmon3
Use "pmon3 [command] --help" for more information about a command.
pmon3 exec [process_binary] [flags]
The starting process accepts several parameters. The parameter descriptions are as follows:
// The process name. It will use the file name of the binary as the default name if not provided
--name
// Where to store logs. It will override the configuration file's log directory setting (`logs.directory`)
--log-dir
// The absolute path of a custom log file
--log -l
// Provide parameters to be passed to the binary, multiple parameters are separated by spaces
--args -a "-arg1=val1 -arg2=val2"
// Provide environment variables (appended to those already existing on the system `os.Environ()`)
--env-vars "ENV_VAR_1=env_var_1_value ENV_VAR_2=env_var_2_value"
// The user to run the process as (defaults to the calling user)
--user -u
// Do not restart automatically. It will automatically restart by default.
--no-autorestart -n
// Provide a list of process names that this process will depend on
--dependencies parent-process-name [--dependencies parent-process-name2]...
// Provide a list of group names this process is associated with
--groups group-name-one [--groups group-name-2]
pmon3 exec ./bin/gin --args "-prjHome=`pwd`" --user joe
❗ Parameter arguments need to use the absolute path.
pmon3 ls
pmon3 restart [id or name]
pmon3 stop [id or name]
# view logs of the process specified
pmon3 log [id or name]
# view logs of the process specified including those previously rotated/archived
pmon3 log -a [id or name]
# Similar to using `tail -f xxx.log`
pmon3 logf [id or name]
pmon3 del [id or name]
pmon3 desc [id or name]
pmon3 kill [--force]
#(re)start processes specified in the Process Config only
pmon3 init --process-config-only
#(re)start processes specified in the Process Config and those which already exist in the database
pmon3 init
pmon3 drop [--force]
This command is useful for debugging dependency resolution without (re)starting processes:
#processes specified in the Process Config only
pmon3 dgraph --process-config-only
#processes specified in the Process Config and the database
pmon3 dgraph
This command will reset the restart counter back to 0. This is useful to establish a clean slate without having to restart the pmond process.
#reset the restart counter on all processes
pmon3 reset
#reset the restart counter on a specific process
pmon3 reset [process_id_or_name]
This command is useful when you want to generate a Process Configuration from the current process list, which can then be saved as the process_config_file used for pmond initialization.
pmon3 export
#specify json as a format (default)
pmon3 export json
#specify toml as a format
pmon3 export toml
#specify yaml as a format
pmon3 export yaml
#order by name ascending instead of id
pmon3 export yaml -o name
This will output the resource utilization of all processes using the native top command that is pre-installed on most unix-based operating systems. It will only show those processes managed by (and including) the pmond process.
pmon3 topn
The default path of the configuration file is /etc/pmon3/config/config.yml. This value can be overridden with the PMON3_CONF environment variable.
The following configuration options are available:
# log levels: debug/info/warn/error
log_level: info
# kill processes on termination
handle_interrupts: true
# whether to reload from the configuration before executing a command
disable_reloads: false
# poll processes every [n] milliseconds
process_monitor_interval: 500
# wait [n] seconds before monitoring process statuses
initialization_period: 30
# a configuration file to specify a list of processes to start on the first initialization (json, yaml or toml)
process_config_file: /etc/pmon3/config/process.config.json
wait:
  # wait [n] milliseconds before outputting list after running init/stop/restart/kill/drop/exec
  cmd_exec_response: 1500
  # wait [n] milliseconds after connecting to IPC client before issuing commands
  ipc_connection: 0
  # wait [n] milliseconds after enqueueing a dependent process
  dependent_process_enqueued: 1000
#permission settings applied to files and directories of data/logs/message_queue
permissions:
  # an OS user to access files in data/logs/message_queue directory
  user:
  # an OS group to access files in data/logs/message_queue directory (must also provide a user)
  group:
  # the mode to apply to the data/logs/message_queue directory
  directory_mode: 0770
  # the mode to apply to files within the data/logs/message_queue directory
  file_mode: 0660
#settings specific to the logs directory (inherits from permissions by default)
logs:
  # directory where the logs are stored
  directory: /var/log/pmond
  # an OS user to access files in logs directory
  user:
  # an OS group to access files in logs directory (must also provide a user)
  group:
  # the mode to apply to the logs directory
  directory_mode: 0775
  # the mode to apply to files within the logs directory
  file_mode:
#settings specific to the data directory (inherits from permissions by default)
data:
  # directory where the database is stored
  directory: /etc/pmon3/data
  # an OS user to access files in data directory
  user:
  # an OS group to access files in data directory (must also provide a user)
  group:
  # the mode to apply to the data directory
  directory_mode:
  # the mode to apply to files within the data directory
  file_mode:
#settings specific to the message_queue directory (inherits from permissions by default)
message_queue:
  # directory where the messages are stored
  directory:
    # custom shared memory directory
    shmem: /dev/shm/
    # custom posix_mq directory
    posix_mq: /dev/mqueue/
  # a string to append to the name of the queue
  name_suffix:
  # an OS user to access files in message_queue directory
  user:
  # an OS group to access files in message_queue directory (must also provide a user)
  group:
  # the mode to apply to the message_queue directory
  directory_mode: 0775
  # the mode to apply to files within the message_queue directory
  file_mode: 0666
event_handling:
  # a script to execute when a process is restarted which accepts the process details json as the first argument
  process_restart:
  # a script to execute when a process fails (--no-autorestart) which accepts the process details json as the first argument
  process_failure:
  # a script to execute when a process backs off when flap detection is enabled
  process_backoff:
flap_detection:
  # enable flap detection
  is_enabled: false
  # the amount of times a process can restart (within the countdown threshold) until backoff evaluation begins
  threshold_restarted: 5
  # the amount of process monitor intervals during a processes backoff period until process evaluation proceeds as normal
  threshold_countdown: 120
  # the amount of process monitor intervals during a processes backoff period until the processes restart counter is decremented. disabled with 0 value.
  threshold_decrement: 60
Restarting pmond is usually unnecessary: All configuration changes should take effect when the next command is issued.
The configuration values can be overridden using environment variables:
CONFIGOR_LOGLEVEL
CONFIGOR_HANDLEINTERRUPTS
CONFIGOR_PROCESSMONITORINTERVAL
CONFIGOR_INITIALIZATIONPERIOD
CONFIGOR_PROCESSCONFIGFILE
CONFIGOR_WAIT_CMDEXECRESPONSE
CONFIGOR_WAIT_IPCCONNECTION
CONFIGOR_WAIT_DEPENDENTPROCESSENQUEUED
CONFIGOR_PERMISSIONS_USER
CONFIGOR_PERMISSIONS_GROUP
CONFIGOR_PERMISSIONS_DIRECTORYMODE
CONFIGOR_PERMISSIONS_FILEMODE
CONFIGOR_LOGS_DIRECTORY
CONFIGOR_LOGS_USER
CONFIGOR_LOGS_GROUP
CONFIGOR_LOGS_DIRECTORYMODE
CONFIGOR_LOGS_FILEMODE
CONFIGOR_DATA_DIRECTORY
CONFIGOR_DATA_USER
CONFIGOR_DATA_GROUP
CONFIGOR_DATA_DIRECTORYMODE
CONFIGOR_DATA_FILEMODE
CONFIGOR_MESSAGEQUEUE_DIRECTORY
CONFIGOR_MESSAGEQUEUE_DIRECTORY_SHMEM
CONFIGOR_MESSAGEQUEUE_DIRECTORY_POSIXMQ
CONFIGOR_MESSAGEQUEUE_NAMESUFFIX
CONFIGOR_MESSAGEQUEUE_USER
CONFIGOR_MESSAGEQUEUE_GROUP
CONFIGOR_MESSAGEQUEUE_DIRECTORYMODE
CONFIGOR_MESSAGEQUEUE_FILEMODE
CONFIGOR_EVENTHANDLER_PROCESSRESTART
CONFIGOR_EVENTHANDLER_PROCESSFAILURE
CONFIGOR_FLAPDETECTION_ISENABLED
CONFIGOR_FLAPDETECTION_THRESHOLDRESTARTED
CONFIGOR_FLAPDETECTION_THRESHOLDCOUNTDOWN
CONFIGOR_FLAPDETECTION_THRESHOLDDECREMENT
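Most of the names listed above follow a visible pattern: the nested configuration keys are joined with underscores, underscores inside a key are dropped, and the result is upper-cased behind a CONFIGOR_ prefix. The sketch below reproduces that pattern for convenience. The helper name configor_env_name and the dotted key-path notation are illustrative assumptions, not part of pmon3. The event handler entries are listed with an EVENTHANDLER segment although the configuration key is event_handling, so treat the list above as authoritative rather than the derivation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a CONFIGOR_* variable name from a dotted
# configuration key path by dropping underscores, turning dots into
# underscores, and upper-casing the result.
configor_env_name() {
  printf 'CONFIGOR_%s\n' "$(printf '%s' "$1" | tr -d '_' | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

configor_env_name "log_level"                      # CONFIGOR_LOGLEVEL
configor_env_name "wait.cmd_exec_response"         # CONFIGOR_WAIT_CMDEXECRESPONSE
configor_env_name "message_queue.directory.shmem"  # CONFIGOR_MESSAGEQUEUE_DIRECTORY_SHMEM
```

The printed name can then be placed in front of a command or exported in the environment the daemon is started from.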
By default, when pmond is restarted from a previously stopped state, it will load all processes in the database that:
- were previously running
- have been marked as stopped as a result of pmond closing
- have --no-autorestart set to false (the default value)
# a configuration file to specify a list of processes to start on the first initialization (json, yaml or toml)
process_config_file: /etc/pmon3/config/process.config.json
{
  "processes": [
    {
      "file": "/usr/local/bin/happac",
      "name": "happac1",
      "args": "-h startup-patroni-1.node.consul -p 5555 -r 5000",
      "user": "vagrant",
      "log_dir": "/var/log/custom/",
      "dependencies": ["happac2"],
      "groups": ["happac"]
    },
    {
      "file": "/usr/local/bin/happab",
      "name": "happac2",
      "log": "/var/log/happac2.log",
      "args": "-h startup-patroni-1.node.consul -p 5556 -r 5001",
      "user": "vagrant",
      "no_auto_restart": true,
      "groups": ["happac"]
    },
    {
      "file": "/usr/local/bin/node",
      "name": "metabase-api",
      "args": "/var/www/vhosts/metabase-api/index.js",
      "env_vars": "NODE_ENV=prod",
      "user": "dw_user"
    }
  ]
}
processes:
  - file: "/usr/local/bin/happac"
    name: happac1
    args: "-h startup-patroni-1.node.consul -p 5555 -r 5000"
    user: vagrant
    log_dir: "/var/log/custom/"
    dependencies:
      - happac2
    groups:
      - happac
  - file: "/usr/local/bin/happab"
    name: happac2
    log: "/var/log/happac2.log"
    args: "-h startup-patroni-1.node.consul -p 5556 -r 5001"
    user: vagrant
    no_auto_restart: true
    groups:
      - happac
  - file: "/usr/local/bin/node"
    name: metabase-api
    args: "/var/www/vhosts/metabase-api/index.js"
    env_vars: NODE_ENV=prod
    user: dw_user
Unlike JSON and YAML, all TOML fields use upper camel case:
[[Processes]]
File = "/usr/local/bin/happac"
Name = "happac1"
Args = "-h startup-patroni-1.node.consul -p 5555 -r 5000"
User = "vagrant"
LogDir = "/var/log/custom/"
Dependencies = [ "happac2" ]
Groups = [ "happac" ]
[[Processes]]
File = "/usr/local/bin/happab"
Name = "happac2"
Log = "/var/log/happac2.log"
Args = "-h startup-patroni-1.node.consul -p 5556 -r 5001"
User = "vagrant"
NoAutoRestart = true
Groups = [ "happac" ]
[[Processes]]
File = "/usr/local/bin/node"
Name = "metabase-api"
Args = "/var/www/vhosts/metabase-api/index.js"
EnvVars = "NODE_ENV=prod"
User = "dw_user"
Instead of configuring this file from scratch you can use the export command to output the configuration from the current process list. This allows the administrator to build a process list using imperative commands followed by exporting the results to a configuration file.
The available fields mirror the flags of the exec command:
- file
- user
- log
- log_dir
- no_auto_restart
- args
- env_vars
- name
- dependencies
- groups
Dependencies (provided as an array) determine the order in which processes are started. They are sorted using a directed acyclic graph, meaning that there cannot be cyclical dependencies between processes. Dependency resolution can be debugged using the dgraph command. Parent processes can wait [n] milliseconds between spawning dependent processes by setting the dependent_process_enqueued configuration variable under wait (set to 1000 milliseconds in the configuration shown above).
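To get a feel for how a dependency list turns into a start order, the standard tsort utility can sort the same kind of graph. This is only an illustration of topological ordering and is not how pmon3 resolves dependencies internally; use pmon3 dgraph to see the real queue order. The helper name start_order is hypothetical, and the sketch assumes that a process listed under dependencies is started before the process that depends on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a start order for "parent child" pairs,
# one name per line. Each pair means the parent starts before the child.
start_order() {
  printf '%s\n' "$@" | tsort
}

# happac1 depends on happac2. metabase-api has no dependencies, so it is
# paired with itself, which tsort treats as "present, no ordering".
start_order "happac2 happac1" "metabase-api metabase-api"
```

In the output, happac2 is listed before happac1. The position of metabase-api is not constrained because it has no edges.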
Groups are useful when dealing with a large number of related processes. Like processes, they are stored in the database and provide many-to-many cardinality: multiple groups can be associated with one or more processes and vice versa. Groups can be managed via Process Configuration and through the CLI.
Group level commands
Usage:
  pmon3 group [command]
Aliases:
  group, groups
Available Commands:
  assign   Assign group(s) to process(es)
  create   Create a new group
  del      Delete a group
  desc     Show group details and associated processes
  drop     Delete all processes associated to a group
  ls       List all groups
  remove   Remove process(es) from group(s)
  restart  (Re)start processes by group id or name
  stop     Stop all processes associated to a group
Flags:
  -h, --help   help for group
Use "pmon3 group [command] --help" for more information about a command.
First, let's create a group:
pmon3 group create happac
Next, let's assign it to the processes with ids 3 and 4:
pmon3 group assign happac 3,4
Now, let's confirm that it was associated with the correct processes:
pmon3 group desc happac
Oops, process 4 should not be associated with the happac group. Let's remove the association:
pmon3 group remove happac 4
This group is no longer useful. Let's delete it while keeping the processes intact:
pmon3 group del happac
If you make a change to the group in the Process Configuration while pmond is running, you can make the changes take effect by running the init command. The init command should not restart processes which are already running, but it will apply changes from the process_config_file.
pmon3 init
event_handling:
  # a script to execute when a process is restarted which accepts the process details json as the first argument
  process_restart:
  # a script to execute when a process fails (--no-autorestart) which accepts the process details json as the first argument
  process_failure:
  # a script to execute when a process backs off when flap detection is enabled
  process_backoff:
1. Specify the executable script to run for the process_restart value. pmond will pass a json-escaped string of the process details as the first argument.
event_handling:
  # a script to execute when a process is restarted which accepts the process details json as the first argument
  process_restart: "/etc/pmon3/bin/on_restart.bash"
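#!/bin/bash
# pmond passes the process details JSON as the first argument
PROCESS_JSON="$1"
PROCESS_ID=$(echo "${PROCESS_JSON}" | jq '.id')
# without jq's -r flag, the name keeps its JSON double quotes
PROCESS_NAME=$(echo "${PROCESS_JSON}" | jq '.name')
echo "process restarted: ${PROCESS_ID} - ${PROCESS_NAME}" >> /var/log/pmond/output.log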
$ PMON3_DEBUG=true pmond
INFO/vagrant/go_src/pmon3/pmond/observer/observer.go:29 pmon3/pmond/observer.HandleEvent() Received event: &{restarted 0xc0001da630}
WARN/vagrant/go_src/pmon3/pmond/observer/observer.go:47 pmon3/pmond/observer.onRestartEvent() restarting process: happac3 (3)
DEBU/vagrant/go_src/pmon3/pmond/observer/observer.go:70 pmon3/pmond/observer.onEventExec() Attempting event executor(restarted): /etc/pmon3/bin/on_restart.bash "{\"id\":3,\"created_at\":\"2024-05-03T05:44:25.114957302Z\",\"updated_at\":\"2024-05-03T06:09:18.71222185Z\",\"pid\":4952,\"log\":\"/var/log/pmond/acf3f83.log\",\"name\":\"happac3\",\"process_file\":\"/usr/local/bin/happac\",\"args\":\"-h startup-patroni-1.node.consul -p 5557 -r 5002\",\"status\":2,\"auto_restart\":true,\"uid\":1000,\"username\":\"vagrant\",\"gid\":1000}"
$ tail /var/log/pmond/output.log
process restarted: 4 - "happac4"
Flap Detection provides the ability to detect processes which are in a perpetually-failed state resulting in excessive restarts which can affect system performance among other things. To prevent excessive restarts you can enable flap detection in the configuration file. You can control the behavior by setting various threshold parameters which ultimately affect how often a perpetually-failed process can restart within a given time interval.
flap_detection:
  # enable flap detection
  is_enabled: true
threshold_restarted (default 5) is the number of process restarts before flap prevention begins, at which point the process will cease restarting and enter the backoff state.
flap_detection:
  # the amount of times a process can restart (within the countdown threshold) until backoff evaluation begins
  threshold_restarted: 5
threshold_countdown (default 120) is the number of process monitor intervals until the flap prevention process (backoff state) ends and the process, if still in a perpetually failed state, resumes restarting as normal until the restart threshold is met again. The process monitor interval can also be set in the configuration file, which affects how long it takes to count down to zero.
flap_detection:
  # the amount of process monitor intervals during a processes backoff period until process evaluation proceeds as normal
  threshold_countdown: 120
threshold_decrement (default 60, disabled with 0) is the number of process monitor intervals during the flap prevention process (backoff state) until the internal process restart counter is decremented. This effectively staggers process restarts during the countdown. It is useful when you don't want to back off completely and prefer to allow intermittent restarts during the backoff state.
flap_detection:
  # the amount of process monitor intervals during a processes backoff period until the processes restart counter is decremented. disabled with 0 value.
  threshold_decrement: 60
Using the defaults provided above: since the process monitor interval defaults to 500 milliseconds, when a process enters the perpetually failed state and restarts the 5th time, it will enter the backoff state. At this point the flap prevention process will begin the countdown from 120 to 0. Since the process monitor interval is 500 milliseconds, the countdown will take 120 multiplied by 500 milliseconds (60 seconds).
Caveat: because the decrement threshold is 60, the internal restart counter will decrement from 5 to 4 after 60 multiplied by 500 milliseconds (30 seconds). The process will restart 30 seconds into the backoff state instead of 60 seconds and continue counting down to 0 until exiting the flap detection process. If the process remains in a perpetually failed state, it will take 5 restarts to repeat this process all over again.
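The arithmetic in the example above can be reproduced for other settings with a few lines of shell. This is a minimal sketch: the helper name intervals_to_seconds is hypothetical, and it only converts interval counts to seconds. It does not model pmond's internal counters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a number of process monitor intervals
# into whole seconds (integer division).
# Arguments: <intervals> <interval_ms>
intervals_to_seconds() {
  echo $(( $1 * $2 / 1000 ))
}

monitor_interval_ms=500   # process_monitor_interval
threshold_countdown=120   # flap_detection.threshold_countdown
threshold_decrement=60    # flap_detection.threshold_decrement

echo "backoff lasts $(intervals_to_seconds "$threshold_countdown" "$monitor_interval_ms")s"
echo "first decrement after $(intervals_to_seconds "$threshold_decrement" "$monitor_interval_ms")s"
```

With the default values this prints a 60 second backoff and a first decrement after 30 seconds, matching the figures in the text.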
You can specify debug verbosity from both the pmon3 client and the pmond daemon process using the PMON3_DEBUG environment variable.
PMON3_DEBUG=true pmond
PMON3_DEBUG accepts the following values:
- true: sets the debug level to debug
- debug: has the same effect as true
- info: sets the debug level to info
- warn: sets the debug level to warn
- error: sets the debug level to error
You can also debug the underlying IPC library using XIPC_DEBUG=true:
XIPC_DEBUG=true PMON3_DEBUG=true pmon3 ls
You can set the log level in the yaml configuration file.
#possible values: debug/info/warn/error
#default value when empty or omitted
log_level: "info"
If you do not specify a value, info will be the default Logrus level.
By default, no underlying libraries require CGO. This allows for portability between machines using different versions of GLIBC and also provides easy installation using the Release Installer. Benchmarking results have confirmed less memory and CPU utilization compared to using the libraries that do require CGO_ENABLED=1, described below:
The posix_mq build tag can be provided to swap out the underlying gipc library with posix_mq. The posix_mq wrapper does require CGO_ENABLED=1 and is considerably faster but also consumes slightly more CPU and memory. To enable posix_mq during the build process:
BUILD_FLAGS="-tags posix_mq" make build_cgo
By default, pmon3 utilizes a non-CGO version of sqlite, which is slightly less performant, though the difference is unnoticeable in most circumstances. To enable the CGO version of sqlite:
BUILD_FLAGS="-tags cgo_sqlite" make build_cgo
It depends on your requirements whether you need one or both. To enable both of these CGO-dependent modules for maximizing overall performance:
BUILD_FLAGS="-tags posix_mq,cgo_sqlite" make build_cgo
Or without using the Makefile:
CGO_ENABLED=1 go build -tags "posix_mq,cgo_sqlite" -o bin/pmon3 cmd/pmon3/pmon3.go
CGO_ENABLED=1 go build -tags "posix_mq,cgo_sqlite" -o bin/pmond cmd/pmond/pmond.go
The implementation enabled by the net build tag is significantly less performant than the default shared memory implementation and the posix_mq implementation. It can also use TCP sockets with additional build flags (currently: -tags net,network).
BUILD_FLAGS="-tags net" make build
Or without using the Makefile:
CGO_ENABLED=0 go build -tags net -o bin/pmon3 cmd/pmon3/pmon3.go
CGO_ENABLED=0 go build -tags net -o bin/pmond cmd/pmond/pmond.go
pmon3 comes with a logrotate configuration file, which by default utilizes the /var/log/pmond directory. If you require a custom log path, you can customize config.yml and rpm/pmond.logrotate.
If there is a path in the parameters you pass, please use the absolute path. The pmon3 startup process will start a new sandbox environment to avoid environment variable pollution.
pmon3 provides Bash completion. If commands are not completed automatically, install bash-completion, then exit and reopen the terminal:
sudo yum install -y bash-completion
autoload -U +X compinit && compinit
autoload -U +X bashcompinit && bashcompinit
sudo sh -c "pmon3 completion zsh > /etc/profile.d/pmon3.sh"
source /etc/profile.d/pmon3.sh
If you encounter the error above, make sure the pmond service has started successfully.
sudo systemctl start pmond
You should only use sudo to start the pmond process, which requires superuser privileges due to the required process forking commands. However, the pmon3 CLI should be used without sudo to ensure that the spawned processes are attached to the correct parent PID. When using sudo, the processes will be attached to PPID 1 and, as a result, will become orphaned if the pmond process exits prematurely. Using sudo also prevents non-root users from being able to access the log files.
You must have sudo privileges to do this for security reasons. The --user root flag is redundant because the process is spawned as the calling user by default, and commands run with sudo are called by the root user.
sudo pmon3 exec /usr/local/bin/happac --user root