In this article we will focus on Monit, a system monitoring and error recovery tool.
Let’s start with a simple question:
What is actually worth monitoring?
Luckily, the answer is also simple:
Everything that is critical to our business.
At the application or service level, this could be:
- process existence
- whether the process responds on a given port
- resource consumption
- configuration file changes
- log file existence
At the system level, one should consider the following:
- filesystem checks
- network monitoring
- CPU monitoring
All of these tasks can be accomplished with Monit, because it is able to monitor:
- Files and Directories
- Disks
- Processes
- System
- Hosts
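To make the list above more concrete, here is a minimal sketch that prints one illustrative Monit check per service type. All service names, paths, the address 192.0.2.10 and the thresholds are placeholders chosen for this example, not recommendations; the output is meant to be adapted before it goes anywhere near a real monitrc.

```shell
#!/bin/bash
# Print an illustrative Monit configuration with one check per service type.
# Every name, path, address and threshold below is a placeholder.
sample_checks() {
  cat <<'EOF'
check file example-conf with path /etc/example/app.conf
  if changed checksum then alert

check directory example-dir with path /var/lib/example
  if failed permission 0755 then alert

check filesystem rootfs with path /
  if space usage > 80% then alert

check process example-app with pidfile /var/run/example-app.pid
  if failed port 8080 then alert

check system example-host
  if loadavg (5min) > 2 then alert

check host example-remote with address 192.0.2.10
  if failed port 80 protocol http then alert
EOF
}

sample_checks
```

Each stanza starts with a check line that names the service type, followed by one or more tests that describe when Monit should react.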
Installation
Monit Installation on Gentoo
Install Monit on Gentoo with the following command:
emerge monit
On Gentoo you can always do an installation dry run first:
emerge --pretend monit
Note
This shows what would happen if you installed the given package.
Installation From Source
Note
Most distribution packages lag behind the current release (the Ubuntu package, for example, contains Monit 5.6). That’s why it’s recommended to install Monit from source.
Analyse this Docker image to learn how to install Monit on Ubuntu from source.
If you don’t want to visit that link, here is a short version:
apt-get install -y libssl-dev
cd /tmp/
curl 'https://mmonit.com/monit/dist/monit-5.10.tar.gz' -O
tar -xzvf monit-5.10.tar.gz
rm monit-5.10.tar.gz
cd monit-5.10/
./configure --bindir=/usr/bin/
make && make install
# Create required files
mkdir -p /var/lib/monit
# Make the init file executable
chmod 755 /etc/init.d/monit
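When deciding whether a packaged Monit is older than the release built above, note that comparing versions as plain numbers is misleading: 5.6 looks larger than 5.10. The sketch below is one way to compare them, assuming GNU coreutils (for `sort -V`) is available. The function name `version_lt` is made up for this example.

```shell
#!/bin/bash
# Return success (0) when version $1 is strictly older than version $2.
# Relies on "sort -V" (GNU version sort), so 5.6 sorts before 5.10.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if version_lt "5.6" "5.10"; then
  echo "5.6 is older than 5.10"
fi
```

The installed version itself can be printed with `monit -V`.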
File Structure
All Monit configuration files are kept inside the /etc/monit directory.
The file structure I have settled on looks like this:
/etc/monit
├── conf.d - Monit configuration files
│ ├── apps - in-house applications
│ ├── services - services like nginx, elasticsearch
│ ├── sys - system related
│ └── disabled - all disabled
├── bin - start/stop scripts
│ └── services - services like nginx, elasticsearch
├── monitrc - main configuration file
├── monitrc.d - directory with examples
└── templates
Note
I assume that the application start/stop scripts are deployed together with the application codebase.
This way, if developers change something in the application, the Monit scripts don’t need to be adjusted (convention over configuration).
Just remember to enforce this standard.
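The layout shown above can be created in one go. The following is a minimal sketch; the function name and the optional root argument are assumptions added so that the layout can be tried out in a scratch directory before touching /etc/monit.

```shell
#!/bin/bash
# Create the directory layout described above under a given root.
# The root defaults to /etc/monit; pass another path to experiment safely.
create_monit_layout() {
  local root="${1:-/etc/monit}"
  mkdir -p \
    "$root/conf.d/apps" \
    "$root/conf.d/services" \
    "$root/conf.d/sys" \
    "$root/conf.d/disabled" \
    "$root/bin/services" \
    "$root/monitrc.d" \
    "$root/templates"
}

# Demo: build the tree in a temporary directory and list it.
demo_root="$(mktemp -d)"
create_monit_layout "$demo_root"
find "$demo_root" -type d | sort
rm -rf "$demo_root"
```

Because `mkdir -p` is idempotent, the function can be re-run on a host where part of the structure already exists.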
Application Monitoring Script Template
A typical Monit script consists of the following parts:
- binaries monitoring
- configuration files monitoring
- log files monitoring
An example is presented below:
# Monit <application name> configuration

check process <application name> matching "<phrase which should exist in the ps output>"
  start program = "/bin/su <user> -c '/usr/lib/<application name>/current/bin/start.sh'"
  stop program = "/bin/su <user> -c '/usr/lib/<application name>/current/bin/stop.sh'"
  # CPU
  if cpu > 5% for 2 cycles then alert
  # RAM
  if mem > 80 MB for 5 cycles then alert

# Check if the application directory exists
check directory <application name>-dir path /usr/lib/<application name>/current/
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0755 then alert
  if does not exist then alert

# --- --- --- --- --- --- --- --- --- --- --- ---
# Configuration
# Check if the configuration directory is present
check directory <application name>-conf-dir path /usr/lib/<application name>/current/config/
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0775 then alert
  if does not exist then alert

# Check if the unicorn configuration file is present and unchanged
check file <application name>-conf-file-unicorn with path /usr/lib/<application name>/current/config/unicorn.rb
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0775 then alert
  if does not exist then alert
  if changed checksum then alert

# --- --- --- --- --- --- --- --- --- --- --- ---
# Logs
# Check if the log file is present
check file <application name>-log-file path /var/log/<application name>/access.log
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0644 then alert
  if does not exist then alert
I recommend spending some time with Monit’s documentation.
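Since the template uses the `<application name>` and `<user>` placeholders throughout, it can be filled in mechanically. Below is a minimal sketch using sed; the function name and the example values "blog" and "deploy" are hypothetical, and it assumes the values contain none of the characters sed treats specially here (`|`, `&`, backslash).

```shell
#!/bin/bash
# Fill the <application name> and <user> placeholders of a Monit template.
# Reads the template on stdin and writes the rendered configuration to stdout.
# Assumes the values contain no "|", "&" or backslash characters.
render_monit_template() {
  local app="$1" user="$2"
  sed -e "s|<application name>|$app|g" -e "s|<user>|$user|g"
}

# Example: render a single template line for a hypothetical "blog" application.
echo 'check directory <application name>-dir path /usr/lib/<application name>/current/' |
  render_monit_template blog deploy
```

In practice the template would be kept in a file and rendered with something like `render_monit_template blog deploy < template.conf > blog.conf`, where both file names are again placeholders.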
Start/Stop Scripts
Every application needs a start script, so that it can come up automatically after the server is rebooted.
A good practice is to keep the start/stop scripts in the same directory where the application is deployed.
Note
The deployment process should be adjusted so that it doesn’t interfere with Monit.
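One possible way to keep a deployment from fighting with Monit is to pause monitoring for the service while the deployment runs, then resume it afterwards. The sketch below uses Monit’s `unmonitor` and `monitor` commands; the function name and the `MONIT_BIN` override (handy for dry runs with `echo`) are assumptions made for this example.

```shell
#!/bin/bash
# Run a deployment command while Monit monitoring for the service is paused.
# MONIT_BIN is a hypothetical override so the flow can be dry-run with "echo".
MONIT_BIN="${MONIT_BIN:-monit}"

deploy_with_monit_paused() {
  local service="$1"
  shift
  "$MONIT_BIN" unmonitor "$service" || return 1
  "$@"
  local status=$?
  # Re-enable monitoring even if the deployment command failed.
  "$MONIT_BIN" monitor "$service" || return 1
  return "$status"
}

# Usage (hypothetical service name and deploy script):
#   deploy_with_monit_paused my-app /usr/local/bin/deploy-my-app.sh
```

The function returns the exit status of the deployment command, so a failed deployment is still visible to the caller even though monitoring has been switched back on.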
Start Scripts for Ruby Applications
A sample Ruby start script template which uses RVM is presented below:
#!/bin/bash

# <application name> Start Script
# This script is executed by the "<user>" user

# Define the project directory
project_directory='/home/<user>/<application name>'

# Define ruby related details
ruby_version='ruby-2.0.0-p195'
ruby_gemset='<gemset name>'

# Start the application server
env LANG='en_US.UTF-8' \
rvm_path=$HOME/.rvm/ \
$HOME/.rvm/bin/rvm-shell \
"$ruby_version@$ruby_gemset" \
-c "cd $project_directory/current; RAILS_ENV=production $project_directory/current/bin/unicorn -c $project_directory/current/config/unicorn.rb -E production -D"
And just the relevant part for a Sinatra application:
# Start the application server
env LANG='en_US.UTF-8' \
rvm_path=$HOME/.rvm/ \
$HOME/.rvm/bin/rvm-shell \
"$ruby_version@$ruby_gemset" \
-c "cd $project_directory/current; bundle exec rackup $project_directory/current/config.ru --env staging --pid $project_directory/current/tmp/pids/sinatra.pid --port 8080 >> /var/log/<application name>/out.log 2>> /var/log/<application name>/error.log &"
The most important part here is the use of rvm-shell, whose syntax looks like this:
rvm-shell "<ruby version>@<ruby gemset>" -c "<command which starts the ruby process>"
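The Monit template earlier also references a stop.sh, which is not shown in this article. A minimal sketch of what such a script could do, assuming the application writes its PID to a pid file (as the Sinatra example does with --pid), might look like this. The function name is made up, and a real stop script may prefer a different signal or a grace period.

```shell
#!/bin/bash
# Stop a process whose PID is stored in a pid file, then remove the file.
stop_by_pidfile() {
  local pid_file="$1"
  [ -f "$pid_file" ] || { echo "no pid file: $pid_file" >&2; return 1; }
  local pid
  pid="$(cat "$pid_file")"
  # Send SIGTERM only if the process is still alive.
  if kill -0 "$pid" 2>/dev/null; then
    kill -TERM "$pid"
  fi
  rm -f "$pid_file"
}

# Usage (path taken from the Sinatra example above):
#   stop_by_pidfile "$project_directory/current/tmp/pids/sinatra.pid"
```

Removing the pid file at the end avoids leaving a stale PID behind that could later point at an unrelated process.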
Debugging Monit Configurations
Now that we know where to put our configuration files, let’s see how we can debug and test them.
The first thing we need to do is adjust Monit’s poll cycle length in the /etc/monit/monitrc file:
set daemon 10 # Number in seconds
This value is the time, in seconds, that Monit sleeps between check cycles. We keep it low for debugging purposes.
Then add a configuration file /etc/monit/conf.d/apps/monit-test-app.conf with the following content:
check process monit-test-app with pidfile "/usr/lib/monit-test-app/pids/process.pid"
  start = "/usr/lib/monit-test-app/bin/start.sh"
  if does not exist then start
Enable it by adding the following line to /etc/monit/monitrc:
include /etc/monit/conf.d/apps/*.conf
Our application start script is located at /usr/lib/monit-test-app/bin/start.sh and looks like this:
#!/bin/bash

application_name='monit-test-app'
script_directory="$( cd "$( dirname "$0" )" && pwd )"
application_directory="$script_directory/.."

log_file="/var/log/$application_name/stdout.log"
pid_file="/var/run/$application_name.pid"

# Log the start and record the PID of this script
echo -e "\n$(date +"%y%m%d %H:%M:%S") - start script" >> "$log_file"

echo $BASHPID > "$pid_file"

# Pretend to do some work
echo "$(date +"%y%m%d %H:%M:%S") - sleep 5s" >> "$log_file"
sleep 5

# Clean up the pid file before exiting
echo "$(date +"%y%m%d %H:%M:%S") - remove pid file" >> "$log_file"
rm "$pid_file"
Note
Remember to make the script executable:
chmod u+x /usr/lib/monit-test-app/bin/start.sh
Finally, start Monit (if it is not already running):
service monit start # or: /etc/init.d/monit start
And reload Monit’s configuration:
monit reload
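Before reloading, it can be worth letting Monit check the syntax of the control file first, which is what `monit -t` does. The following sketch chains the two steps; the function name and the `MONIT_BIN` override are assumptions made for this example.

```shell
#!/bin/bash
# Reload Monit only if the control file passes the syntax check.
# MONIT_BIN is a hypothetical override that allows dry runs (e.g. with "echo").
MONIT_BIN="${MONIT_BIN:-monit}"

safe_reload() {
  if "$MONIT_BIN" -t; then
    "$MONIT_BIN" reload
  else
    echo "syntax check failed, not reloading" >&2
    return 1
  fi
}

# Usage:
#   safe_reload
```

This way a typo in a freshly added file under conf.d is reported right away instead of being discovered later in the logs.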
Monit should now run start.sh over and over again.
Check whether Monit does its job by watching the logs:
$ tail -f /var/log/monit.log /var/log/monit-test-app/*
==> /var/log/monit.log <==
[UTC Nov 27 08:07:57] error : 'monit-test-app' process is not running
[UTC Nov 27 08:07:57] info : 'monit-test-app' start: /usr/lib/monit-test-app/bin/start.sh
==> /var/log/monit-test-app/stdout.log <==
141127 08:07:57 - start script
141127 08:07:57 - sleep 5s
141127 08:08:02 - remove pid file
==> /var/log/monit.log <==
[UTC Nov 27 08:08:27] error : 'monit-test-app' failed to start (exit status 0) -- no output
[UTC Nov 27 08:08:37] error : 'monit-test-app' process is not running
[UTC Nov 27 08:08:37] info : 'monit-test-app' start: /usr/lib/monit-test-app/bin/start.sh
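To get a quick number instead of reading the log by eye, the start attempts can be counted with grep. This is a small sketch that assumes the log line format shown in the sample output above (other Monit versions may log differently); the function name is made up for this example.

```shell
#!/bin/bash
# Count how many times Monit tried to start a service, based on log lines
# of the form:  'monit-test-app' start: /path/to/start.sh
count_start_attempts() {
  local service="$1" log_file="${2:-/var/log/monit.log}"
  # -F: fixed-string match, -c: print the number of matching lines.
  grep -cF "'$service' start:" "$log_file"
}

# Usage:
#   count_start_attempts monit-test-app
```

A steadily growing count for the test application confirms that Monit keeps restarting it on every cycle.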
I also like to keep an eye on Monit’s summary in a separate terminal:
$ watch -n1 'monit summary'
Every 1.0s: monit summary Thu Nov 27 00:47:43 2014
The Monit daemon 5.10 uptime: 26m
Process 'monit-test-app' Execution failed
System '6934259efb99' Running
Sometimes it’s helpful to run Monit in verbose mode. Simply change the init file with this command:
sed -i 's/^MONIT_OPTS=$/MONIT_OPTS="-v"/' /etc/init.d/monit
And restart Monit so that the change takes effect:
service monit restart
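Note that the sed expression above only matches when the init file contains an empty `MONIT_OPTS=` line; if the line already has a value, nothing changes and sed stays silent. A small sketch that applies the same substitution and then verifies the result could look like this (the function name is an assumption, and `sed -i` is used as in the command above):

```shell
#!/bin/bash
# Apply the MONIT_OPTS substitution and confirm that it took effect.
# Returns non-zero when the file has no empty "MONIT_OPTS=" line to replace.
enable_verbose() {
  local init_file="$1"
  sed -i 's/^MONIT_OPTS=$/MONIT_OPTS="-v"/' "$init_file"
  grep -q '^MONIT_OPTS="-v"$' "$init_file"
}

# Usage:
#   enable_verbose /etc/init.d/monit && service monit restart
```

For a one-off debugging session, running Monit in the foreground with `monit -Iv` is an alternative that avoids editing the init file at all.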
Useful Monit Commands
- List all commands
  monit -h
- Get Monit summary
  monit summary
- Disable/unmonitor a service
  monit unmonitor <service name>
  If you want to disable the service permanently, you can move the appropriate file to the /etc/monit/conf.d/disabled directory and then reload Monit with monit reload
- View Monit logs
  tail -f /var/log/monit.log
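The "disable permanently" step from the list above can be wrapped in a small helper. This is a sketch under the assumption that the disabled directory is not matched by any include line in monitrc (the include used earlier only covers conf.d/apps/*.conf). The function name, the conf_root argument and the `MONIT_BIN` override are made up for this example.

```shell
#!/bin/bash
# Permanently disable a service by moving its configuration file into the
# "disabled" directory and reloading Monit.
# MONIT_BIN and the conf_root argument are hypothetical knobs for dry runs.
MONIT_BIN="${MONIT_BIN:-monit}"

disable_service() {
  local conf_file="$1" conf_root="${2:-/etc/monit/conf.d}"
  [ -f "$conf_file" ] || { echo "not found: $conf_file" >&2; return 1; }
  mkdir -p "$conf_root/disabled"
  mv "$conf_file" "$conf_root/disabled/" && "$MONIT_BIN" reload
}

# Usage:
#   disable_service /etc/monit/conf.d/apps/monit-test-app.conf
```

Moving the file back into its original directory and reloading again re-enables the service.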