In this article we will focus on Monit, a system monitoring and error recovery tool.
Let’s start with a simple question:
What is actually worth monitoring?
Luckily, the answer is also simple:
Everything that is critical to our business.
At the application or service level, this could be:
- process existence
- whether the process responds on a given port
- resource consumption
- configuration file changes
- log file existence
At the system level, one should consider the following:
- filesystem checks
- network monitoring
- CPU monitoring
All those tasks can be accomplished with Monit, because it’s able to monitor:
- Files and Directories
Monit Installation on Gentoo
Install Monit on Gentoo with the following command:
On Gentoo you can always do an installation dry run:
emerge --pretend monit
This shows what would happen if you installed the given package.
Installation From Source
Most distribution packages are out of date (the Ubuntu one, for example, contains Monit 5.6). That’s why installing Monit from source is recommended.
Analyse this Docker image to learn how to install Monit on Ubuntu from source.
If you don’t want to visit that link, here is a short version:
apt-get install -y libssl-dev
cd /tmp/
curl 'https://mmonit.com/monit/dist/monit-5.10.tar.gz' -O
tar -xzvf monit-5.10.tar.gz
rm monit-5.10.tar.gz
cd monit-5.10/
./configure --bindir=/usr/bin/
make && make install
# Create required files
mkdir -p /var/lib/monit
# Make the init file executable
chmod 755 /etc/init.d/monit
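If you run these steps from a script, it helps to stop at the first failing command. The following is only a sketch under assumptions: the function names are made up for illustration, a temporary build directory replaces /tmp/, and the init-script step is left out. It mirrors the commands above and has not been tested against other Monit versions.

```shell
#!/bin/bash

# Build the download URL for a given Monit version
# (URL pattern taken from the curl command above).
monit_tarball_url() {
  local version="$1"
  echo "https://mmonit.com/monit/dist/monit-${version}.tar.gz"
}

# Download, build and install Monit. Must be run as root.
# The body runs in a subshell, so "set -e" and "cd" do not leak to the caller.
install_monit_from_source() (
  set -euo pipefail
  version="$1"
  build_dir="$(mktemp -d)"   # assumption: throwaway build directory

  apt-get install -y libssl-dev
  cd "$build_dir"
  curl -fSL -O "$(monit_tarball_url "$version")"
  tar -xzf "monit-${version}.tar.gz"
  cd "monit-${version}/"
  ./configure --bindir=/usr/bin/
  make
  make install

  # Create required files
  mkdir -p /var/lib/monit
)
```

Calling `install_monit_from_source 5.10` would then reproduce the manual steps, aborting as soon as one of them fails.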
All Monit configuration files are kept inside of the
The file structure I have become used to looks like this:
/etc/monit
├── conf.d            - Monit configuration files
│   ├── apps          - in-house applications
│   ├── services      - services like nginx, elasticsearch
│   ├── sys           - system related
│   └── disabled      - all disabled
├── bin               - start/stop scripts
│   └── services      - services like nginx, elasticsearch
├── monitrc           - main configuration file
├── monitrc.d         - directory with examples
└── templates
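The directories in this layout can be created in one go. This is a minimal sketch; the function name `create_monit_layout` is hypothetical, and the root directory is a parameter so the layout can be tried out somewhere harmless first. The monitrc file itself is not touched.

```shell
#!/bin/bash

# Create the directory layout shown above under a given root
# (defaults to /etc/monit). Existing directories are left alone.
create_monit_layout() {
  local root="${1:-/etc/monit}"
  mkdir -p \
    "$root/conf.d/apps" \
    "$root/conf.d/services" \
    "$root/conf.d/sys" \
    "$root/conf.d/disabled" \
    "$root/bin/services" \
    "$root/monitrc.d" \
    "$root/templates"
}
```

Keep in mind that Monit only reads files matched by the include statements in monitrc, so every subdirectory that should be active needs to be covered there.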
I assume that the application start/stop scripts are deployed together with the application codebase.
This way, if developers change something in the application, the Monit scripts don’t need to be adjusted (convention over configuration).
Just remember to enforce this standard.
Application Monitoring Script Template
A typical Monit script consists of the following parts:
- binaries monitoring
- configuration files monitoring
- log files monitoring
An example is presented below:
# Monit <application name> configuration

check process <application name> matching "<phrase which should exist in the ps output>"
  start program = "/bin/su <user> -c '/usr/lib/<application name>/current/bin/start.sh'"
  stop program = "/bin/su <user> -c '/usr/lib/<application name>/current/bin/stop.sh'"
  # CPU
  if cpu > 5% for 2 cycles then alert
  # RAM
  if mem > 80 MB for 5 cycles then alert

# Check if the application directory exists
check directory <application name>-dir path /usr/lib/<application name>/current/
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0755 then alert
  if does not exist then alert

# --- --- --- --- --- --- --- --- --- --- --- ---
# Configuration
# Check if the configuration directory is present
check directory <application name>-conf-dir path /usr/lib/<application name>/current/config/
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0775 then alert
  if does not exist then alert

# Check if the unicorn configuration file is present
check file <application name>-conf-file-unicorn with path /usr/lib/<application name>/current/config/unicorn.rb
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0775 then alert
  if does not exist then alert
  if changed checksum then alert

# --- --- --- --- --- --- --- --- --- --- --- ---
# Logs
# Check if the log file is present
check file <application name>-log-file path /var/log/<application name>/access.log
  if failed uid <user> then alert
  if failed gid <user> then alert
  if failed permission 0644 then alert
  if does not exist then alert
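Because the template only differs in the application name and the user, it can be filled in by a small script. The sketch below renders just the process check; the function name `render_app_check` is hypothetical, and it assumes that the application name itself is a good enough phrase to match in the ps output, which will not hold for every application.

```shell
#!/bin/bash

# Print the process check from the template above for one application.
# $1 = application name (also used as the match phrase, an assumption)
# $2 = user that runs the start/stop scripts
render_app_check() {
  local app="$1" user="$2"
  cat <<EOF
# Monit ${app} configuration

check process ${app} matching "${app}"
  start program = "/bin/su ${user} -c '/usr/lib/${app}/current/bin/start.sh'"
  stop program = "/bin/su ${user} -c '/usr/lib/${app}/current/bin/stop.sh'"
  if cpu > 5% for 2 cycles then alert
  if mem > 80 MB for 5 cycles then alert
EOF
}
```

The output could be redirected to a file such as /etc/monit/conf.d/apps/<application name>.conf and checked with `monit -t`, which runs a syntax check of the control file, before reloading.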
I recommend spending some time with Monit’s documentation.
Every application needs a start script, so that it can come up automatically when the server gets rebooted.
A good practice is to put the start/stop scripts in the same directory where the application is deployed.
The deployment should be adjusted so that it doesn’t interfere with Monit.
Start Scripts for Ruby Applications
A sample Ruby start script template which uses RVM is presented below:
#!/bin/bash

# <application name> Start Script
# This script is executed by the "<user>" user

# Define the project directory
project_directory='/home/<user>/<application name>'

# Define ruby related details
ruby_version='ruby-2.0.0-p195'
ruby_gemset='<gemset name>'


# Start the application server
env LANG='en_US.UTF-8' \
    rvm_path=$HOME/.rvm/ \
    $HOME/.rvm/bin/rvm-shell \
    "$ruby_version@$ruby_gemset" \
    -c "cd $project_directory/current; RAILS_ENV=production $project_directory/current/bin/unicorn -c $project_directory/current/config/unicorn.rb -E production -D"
And just the relevant part for a Sinatra application:
# Start the application server
env LANG='en_US.UTF-8' \
    rvm_path=$HOME/.rvm/ \
    $HOME/.rvm/bin/rvm-shell \
    "$ruby_version@$ruby_gemset" \
    -c "cd $project_directory/current; bundle exec rackup $project_directory/current/config.ru --env staging --pid $project_directory/current/tmp/pids/sinatra.pid --port 8080 >> /var/log/<application name>/out.log 2>> /var/log/<application name>/error.log &"
The most important part here is the use of rvm-shell, whose syntax looks like this:
rvm-shell "<ruby version>@<ruby gemset>" -c "<command which starts the ruby process>"
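The Monit template refers to a stop.sh as well, which is not shown here. One possible shape for it, purely as a sketch: it assumes the application writes its PID to a pidfile (as the Sinatra example does with --pid), and the function name `stop_by_pidfile` and the default timeout of 10 seconds are my own choices.

```shell
#!/bin/bash

# Ask the process named in a pidfile to terminate and wait for it to exit.
# $1 = path to the pidfile
# $2 = seconds to wait before sending SIGKILL (default: 10)
stop_by_pidfile() {
  local pid_file="$1" timeout="${2:-10}"
  [ -f "$pid_file" ] || return 0            # no pidfile, nothing to stop

  local pid waited=0
  pid="$(cat "$pid_file")"
  kill -TERM "$pid" 2>/dev/null || true     # the process may already be gone

  # Poll until the process has exited or the timeout is reached
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      kill -KILL "$pid" 2>/dev/null || true # last resort
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done

  rm -f "$pid_file"
}
```

A stop.sh for the Sinatra example could then consist of a single call such as `stop_by_pidfile "$project_directory/current/tmp/pids/sinatra.pid"`.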
Debugging Monit Configurations
Now that we know where to put our configuration files, let’s see how we can debug and test them.
The first thing we need to do is adjust Monit’s poll cycle length in the
set daemon 10 # Number in seconds
This value is the time Monit sleeps between two check cycles. We need to keep it low for debugging purposes, so that we don’t have to wait long for Monit to react.
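If you switch between a short debugging interval and a longer production interval often, the change can be scripted. A rough sketch, assuming GNU sed and a control file in which the `set daemon` statement starts its own line; the function name `set_poll_interval` is hypothetical.

```shell
#!/bin/bash

# Set the poll interval in a Monit control file.
# $1 = path to the control file
# $2 = interval in seconds
set_poll_interval() {
  local monitrc="$1" seconds="$2"
  if grep -Eq '^[[:space:]]*set[[:space:]]+daemon[[:space:]]+[0-9]+' "$monitrc"; then
    # Replace the number of an existing "set daemon" statement
    sed -i -E "s/^[[:space:]]*set[[:space:]]+daemon[[:space:]]+[0-9]+/set daemon ${seconds}/" "$monitrc"
  else
    # No such statement yet: append one
    echo "set daemon ${seconds}" >> "$monitrc"
  fi
}
```

Monit has to be reloaded or restarted afterwards for the new interval to be picked up.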
Then add a configuration file
/etc/monit/conf.d/apps/monit-test-app.conf with the following content:
check process monit-test-app with pidfile "/usr/lib/monit-test-app/pids/process.pid"
  start program = "/usr/lib/monit-test-app/bin/start.sh"
  if does not exist then start
Enable it by adding the following line to
Our application start script is located under
/usr/lib/monit-test-app/bin/start.sh and looks like this:
#!/bin/bash

application_name='monit-test-app'
script_directory="$( cd "$( dirname "$0" )" && pwd )"
application_directory="$script_directory/.."

log_file="/var/log/$application_name/stdout.log"
pid_file="/var/run/$application_name.pid"

echo -e "\n$(date +"%y%m%d %H:%M:%S") - start script" >> "$log_file"

echo $BASHPID >> "$pid_file"

echo "$(date +"%y%m%d %H:%M:%S") - sleep 5s" >> "$log_file"
sleep 5

echo "$(date +"%y%m%d %H:%M:%S") - remove pid file" >> "$log_file"
rm "$pid_file"
Remember to make the script executable:
chmod u+x /usr/lib/monit-test-app/bin/start.sh
Finally, start Monit (if it is not already running):
service monit start # or: /etc/init.d/monit start
And reload Monit’s configuration:
Monit should now run start.sh over and over again.
Check whether Monit does its job by looking at the logs:
$ tail -f /var/log/monit.log /var/log/monit-test-app/*
==> /var/log/monit.log <==
[UTC Nov 27 08:07:57] error : 'monit-test-app' process is not running
[UTC Nov 27 08:07:57] info : 'monit-test-app' start: /usr/lib/monit-test-app/bin/start.sh

==> /var/log/monit-test-app/stdout.log <==

141127 08:07:57 - start script
141127 08:07:57 - sleep 5s
141127 08:08:02 - remove pid file

==> /var/log/monit.log <==
[UTC Nov 27 08:08:27] error : 'monit-test-app' failed to start (exit status 0) -- no output
[UTC Nov 27 08:08:37] error : 'monit-test-app' process is not running
[UTC Nov 27 08:08:37] info : 'monit-test-app' start: /usr/lib/monit-test-app/bin/start.sh
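When the log gets longer, counting the start attempts is quicker than reading them. The sketch below relies on the "start:" lines looking exactly like the ones in the output above; other Monit versions may log this differently, and the function name `count_start_attempts` is made up for this example.

```shell
#!/bin/bash

# Count how often Monit tried to start a service, based on the
# "'<service>' start:" lines in the Monit log.
# $1 = service name
# $2 = log file (default: /var/log/monit.log)
count_start_attempts() {
  local service="$1" log_file="${2:-/var/log/monit.log}"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
  grep -cF -- "'${service}' start:" "$log_file" || true
}
```

For the test application this would be called as `count_start_attempts monit-test-app`.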
I also like to keep an eye on Monit’s summary in a separate terminal:
$ watch -n1 'monit summary'
Every 1.0s: monit summary                Thu Nov 27 00:47:43 2014

The Monit daemon 5.10 uptime: 26m

Process 'monit-test-app'            Execution failed
System '6934259efb99'               Running
Sometimes it’s helpful to run Monit in verbose mode. To do so, set the MONIT_OPTS variable in the init file with this command:
sed -i 's/^MONIT_OPTS=$/MONIT_OPTS="-v"/' /etc/init.d/monit
Then restart Monit so that the change takes effect:
service monit restart
Useful Monit Commands
List all commands
Get Monit summary
Disable/unmonitor a service
monit unmonitor <service name>
If you want to disable the service permanently, you can move the appropriate file to the /etc/monit/conf.d/disabled directory and then reload Monit.
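Moving the file can be wrapped in a small helper. This is a sketch only; the function name `disable_monit_config` is hypothetical, and the target directory defaults to the one from the layout described earlier.

```shell
#!/bin/bash

# Move a service's configuration file into the "disabled" directory.
# $1 = configuration file to disable
# $2 = target directory (default: /etc/monit/conf.d/disabled)
disable_monit_config() {
  local conf_file="$1"
  local disabled_dir="${2:-/etc/monit/conf.d/disabled}"
  mkdir -p "$disabled_dir"
  mv "$conf_file" "$disabled_dir/"
}
```

For example, `disable_monit_config /etc/monit/conf.d/apps/monit-test-app.conf` followed by `monit reload` would take the test application out of monitoring until the file is moved back.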
View Monit logs
tail -f /var/log/monit.log