tech.gate.io blog
After migrating from xymon 4.2.0 to xymon 4.3.3 Beta 2, our RRD files built with SPLITNCV stopped working.
SPLITNCV stores each value in its own RRD file, which makes it easy to generate graphs for an unknown number of output values using a regexp,
for example for network interface performance monitoring, where the number of interfaces may differ from host to host.
Existing RRDs were no longer updated, and no new ones were created.
We spent a lot of time searching for a configuration problem that might have caused this,
but then we found this on the xymon mailing list:
http://www.hswn.dk/hobbiton/2009/07/msg00242.html
in short:
go to xymon-4.3.0-beta2 directory
change to hobbitd/rrd
and edit the source file do_ncv.c
change line 180:
if (split_ncv && (paridx > 1)) {
to
if (split_ncv && (paridx > 0)) {
recompile and replace the hobbitd_rrd binary in server/bin with the newly compiled one
restart hobbit
RRD updates should be working again now
For Web designers, it's important to know which screen and browser window sizes the Web site's visitors use, because a new design should be optimised in these ways:
1. existing space should be utilised as well as possible
2. horizontal scrolling should be avoided because it's very annoying for users
To get an idea of your users' settings, you can use the AWStats log analysing tool. It's free and generates graphical access statistics for your Web site. It comes with a JavaScript file (awstats_misc_tracker.js) that you can embed on your site. When a user visits the site, some properties of their browser are reported to the logs in a form like this:
www.gimpusers.com xxx.xxx.xxx.xxx - - [14/Feb/2010:06:38:20 +0100] "GET /js/awstats_misc_tracker.js?screen=1280x800&win=1263x616&cdi=24&java=true&shk=n&svg=n&fla=y&rp=n&mov=y&wma=y&pdf=y&uid=awsuser_id1266125900481r8004&sid=awssession_id1266125900481r8004 HTTP/1.1" 200 2676 "http://www.gimpusers.com/forums/gimp-docs/10734-print-version-of-Gimp-Manual-pdf.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6" (35)
As you can see, this user had a 1280x800 screen resolution and the usable browser window size was 1263x616. Sadly, the graphical report of AWStats only shows the screen resolutions and not the window sizes, although the window sizes are more interesting (we don't expect our users to resize their browser window for our Web site).
To extract information about the window size probability distribution, we'll use R (http://www.r-project.org, or for Debian/Ubuntu, there's a package "r-recommended"), a free software environment for statistical computing and graphics. First, we have to prepare our log files and extract the AWStats lines:
awk -- '/&win=[[:digit:]]+x[[:digit:]]+/ { print $0; }' /var/log/apache2/access-gimpusers.log >raw
All lines containing &win=AAxBB will be printed to "raw".
Then we create two files with the screen and window widths (the three-argument form of match() is a GNU awk extension, so these commands need gawk):
awk -- '{ match($0, /\?screen=([[:digit:]]+)x[[:digit:]]+/, a); print a[1]; }' raw >screen
awk -- '{ match($0, /\&win=([[:digit:]]+)x[[:digit:]]+/, a); print a[1]; }' raw >win
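For readers without gawk, the same extraction can be sketched in Python. This is only an illustration of the two awk commands above, not part of the original workflow; the function name extract_widths is made up, and unlike awk it silently skips lines that lack either value instead of printing an empty line.

```python
import re

# Patterns mirror the awk expressions: capture only the width part.
SCREEN_RE = re.compile(r"[?&]screen=(\d+)x\d+")
WIN_RE = re.compile(r"&win=(\d+)x\d+")


def extract_widths(lines):
    """Return (screen_widths, win_widths) for all lines that report both sizes."""
    screens, wins = [], []
    for line in lines:
        win = WIN_RE.search(line)
        screen = SCREEN_RE.search(line)
        if win and screen:
            screens.append(int(screen.group(1)))
            wins.append(int(win.group(1)))
    return screens, wins


# Shortened version of the log line shown earlier.
sample = (
    'www.gimpusers.com xxx.xxx.xxx.xxx - - [14/Feb/2010:06:38:20 +0100] '
    '"GET /js/awstats_misc_tracker.js?screen=1280x800&win=1263x616&cdi=24 HTTP/1.1" 200 2676'
)
print(extract_widths([sample]))
```

The two returned lists correspond to the "screen" and "win" files and could be written out one value per line for scan() in R.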
Now we can start R and load the values:
$ R
> screen <- scan('screen')
Read 20914 items
> win <- scan('win')
Read 20914 items
Let's get a summary:
> summary(win)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1    1027    1263    1258    1358    2545
So the mean window width is about 1258 px, and 50% of our users have a window width of at most 1263 px (the median), i.e. in the range [1, 1263].
For a better view, we can plot a histogram:
> hist(win)
or the distribution function F:
> plot(ecdf(win),do.points=FALSE,verticals=TRUE)
or the density function:
> plot(density(win))
We can see peaks at the default resolutions.
If we want to know which site width we can choose so that only a fraction x of our users will have to scroll, we have to look at the x-quantile of the win distribution:
> quantile(win)
  0%  25%  50%  75% 100%
   1 1027 1263 1358 2545
> quantile(win,.1)
 10%
1000
> quantile(win,.2)
 20%
1007
> quantile(win,.3)
 30%
1127
> quantile(win,.4)
 40%
1255
So 10% will have to scroll horizontally if we choose a site width of 1000, 20% for 1007, 30% for 1127 and 40% for 1255. If we design
for a 1024 px screen size, i.e. about 990 px Web site width, at most 10% of our users will have to scroll. However, if we design for 1127 px
site width, 30% of the users will have to scroll.
This can't be the only aspect taken into consideration, though, because it may be more important to pack content onto the page
or to optimise it for larger screens (and window sizes). This is up to you, but the statistical analysis gives you a hint.
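To make the quantile reasoning concrete, here is a small Python sketch. It assumes the linear-interpolation definition that R's quantile() uses by default (type 7); the function names and the sample widths are made up for illustration and are not the real log data.

```python
import math


def quantile_type7(values, p):
    """Linear-interpolation quantile (R's default 'type 7' definition)."""
    if not values:
        raise ValueError("need at least one value")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    data = sorted(values)
    h = (len(data) - 1) * p          # fractional index into the sorted data
    lower = math.floor(h)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (h - lower) * (data[upper] - data[lower])


def share_forced_to_scroll(win_widths, site_width):
    """Fraction of visitors whose window is narrower than the site."""
    return sum(w < site_width for w in win_widths) / len(win_widths)


# Made-up window widths, only for illustration.
widths = [800, 1000, 1007, 1024, 1127, 1255, 1263, 1280, 1358, 1920]
print(quantile_type7(widths, 0.5))
print(share_forced_to_scroll(widths, 1024))
```

The second function answers the question from the other direction: given a candidate site width, which share of visitors has a narrower window.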
Finally, we can determine the (Pearson) correlation between window and screen sizes:
> cor(win,screen)
[1] 0.1900401
=> The linear (!) correlation coefficient between window and screen size is about 0.19 (19%), which means there is no good linear relationship (of
the form window size = A * screen size + B). So I think that most people don't work in full-screen mode, because in full-screen mode,
window size = screen size + B (so it would be linear).
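The argument about full-screen mode can be checked with a small Python sketch of the Pearson coefficient. The numbers are invented, and the 17 px offset is just an assumed amount of browser chrome: if every window were "screen width plus a constant", the coefficient would be 1, while arbitrary window widths give something lower.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up screen widths; every window is "screen minus 17 px of browser chrome".
screens = [1024, 1280, 1440, 1680, 1920]
maximised = [s - 17 for s in screens]
print(round(pearson(maximised, screens), 6))

# The same screens with arbitrary, user-chosen window widths.
resized = [1000, 900, 1400, 1000, 1263]
print(round(pearson(resized, screens), 3))
```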
Problem:
My company needs to monitor servers, services, switches and UPSs. The goal of this task is to set up a monitoring system that can check the devices and services and send alarms to several people. It should distinguish between critical and non-critical services and devices.
Preface:
Everything you do here happens at your own risk!
I'm using FreeBSD 7.2 for this task, to be more precise a jailed instance of it. So you should be able to install FreeBSD and update it. Please update your ports before starting, to be sure that you get the newest version of Nagios. I'll describe how to update your system in another post.
Tip: Back up, because you will break something!
Solution:
As you can see in the title I decided to use Nagios. You can find a lot of resources at:
http://www.nagios.org/
http://www.monitoringexchange.org/
http://nagios.manubulon.com/
Installing
Okay, let's start the actual work! We need something to display the output of our work, and the standard Apache will do that for us.
>cd /usr/ports/www/apache22 && make install clean
Just use the standard settings for the Apache server; you don't need to change anything for this task.
Now we need Nagios:
>cd /usr/ports/net-mgmt/nagios && make install clean
Enable the embedded Perl option and hit OK; none of the X11 options are needed as long as you don't use X11. When the installer asks which options should be compiled for PHP, you have to check Apache, or the mod_php module won't be compiled.
For the Nagios plugins, enable everything; only a few megabytes of disk space are needed. FreeBSD will fetch them from SourceForge.
Now wait a little; how long depends on the power of your CPU core(s). The installer will ask whether you want to create a group "nagios". Answer Yes. After that you'll be asked to create a user called "nagios". Answer Yes. A few moments later the Nagios installer is finished and gives you some advice, which we'll follow now.
Configuration
First of all, we'll edit Apache's httpd.conf. This is needed so that the Nagios GUI is displayed properly.
>vi /usr/local/etc/apache22/httpd.conf
Check that the PHP module is loaded:
LoadModule php5_module libexec/apache22/libphp5.so
To enable CGI, delete the # in front of the following line; you can also add the .pl extension if you want to run Perl scripts:
AddHandler cgi-script .cgi .pl
Now search for the section and add:
ScriptAlias /nagios/cgi-bin/ /usr/local/www/nagios/cgi-bin/
Alias /nagios/ /usr/local/www/nagios/
As I don't cover security here, we'll make Nagios visible to everyone. If you want to restrict access, please read the Apache manual: http://httpd.apache.org/docs/2.0/howto/auth.html
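With that in mind we add the lines for the static Nagios page (in the Directory block for /usr/local/www/nagios):
Order deny,allow
Allow from all
php_flag engine on
php_admin_value open_basedir /usr/local/www/nagios/:/var/spool/nagios/
and for the CGI application (in the Directory block for /usr/local/www/nagios/cgi-bin):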
Options +ExecCGI
Okay, now it should be possible to start the web server:
>apachectl start
If you are working in a jail, ps -ax doesn't work properly, so just type http://[IP of your server]/nagios into the address bar of your browser. The rest is up to Apache, and you should see the Nagios start page.
If the web server doesn't start in the jail, you may have forgotten to load the kernel module "accf_http". You can check whether it's loaded using
>kldstat | grep accf
You should see something like:
5 1 0xc6c22000 2000 accf_http.ko
Kernel modules can't be loaded inside a jail; you have to do that on the jail host:
>kldload accf_http
Congratulations, you have installed Nagios, but you cannot monitor anything yet. You installed the static website, but now we have to get the Nagios service up.
Nagios service
Preparation:
I'll try to give you a crash course in using Nagios. But before starting, please make sure you know what SNMP is, how to use it and how to run snmpwalk.
You should find your config files under /usr/local/etc/nagios
Here you'll find nagios.cfg-sample, cgi.cfg-sample and resource.cfg-sample. Copy these files to another location as a backup (in my case the sample folder), then rename the originals to nagios.cfg, cgi.cfg and so on.
>mkdir sample
>cp *.cfg-sample sample/
>mv nagios.cfg-sample nagios.cfg
or use my rename script. Now you should have 3 files and 2 folders: sample and objects.
Let's go to the objects folder and do exactly the same:
>cd objects
>mkdir sample
>cp *.cfg-sample sample/
>mv commands.cfg-sample commands.cfg
and so on...
In the end you should have a fileset like this:
commands.cfg
contacts.cfg
localhost.cfg
printer.cfg
sample
switch.cfg
templates.cfg
timeperiods.cfg
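The copy-and-rename steps above can also be scripted. The following Python sketch is not the author's rename script (that one is not shown here); it is a hypothetical equivalent whose function name activate_samples is made up, and it is demonstrated in a temporary directory rather than in /usr/local/etc/nagios.

```python
import shutil
import tempfile
from pathlib import Path


def activate_samples(config_dir):
    """Back up every *.cfg-sample into sample/ and rename it to *.cfg.

    Existing *.cfg files are left alone so a second run cannot overwrite edits.
    Returns the names of the files that were activated.
    """
    config_dir = Path(config_dir)
    backup_dir = config_dir / "sample"
    backup_dir.mkdir(exist_ok=True)
    activated = []
    for sample in sorted(config_dir.glob("*.cfg-sample")):
        target = sample.with_name(sample.name[: -len("-sample")])
        shutil.copy2(sample, backup_dir / sample.name)   # keep the pristine copy
        if target.exists():
            continue
        sample.rename(target)
        activated.append(target.name)
    return activated


# Demonstration in a throwaway directory instead of the real config folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("nagios.cfg-sample", "cgi.cfg-sample", "resource.cfg-sample"):
        (Path(tmp) / name).write_text("# sample\n")
    print(activate_samples(tmp))
```

Running it once for the main config folder and once for the objects folder would give the file set listed above.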
Actual work
The time of mindless copy and paste is now over; you have to start thinking.
The whole thing with Nagios is understanding inheritance. The Nagios team did a lot of work for you, so let's have a look. To be able to view all hosts and services, edit cgi.cfg and change the parameter
use_authentication=0
from 1 to 0, or you'll get an error message. (But as the comments in this file say, this is a bad idea for a production system.)
You can find the Nagios configuration under /usr/local/etc/nagios/nagios.cfg
Here you just need to define which object types should be used. For example:
# You can specify individual object config files as shown below:
cfg_file=/usr/local/etc/nagios/objects/commands.cfg
cfg_file=/usr/local/etc/nagios/objects/contacts.cfg
cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg
cfg_file=/usr/local/etc/nagios/objects/templates.cfg
# Definitions for monitoring the local (FreeBSD) host
cfg_file=/usr/local/etc/nagios/objects/localhost.cfg
These files define the behaviour of Nagios. We'll add some more object files a little later to monitor a Windows server. But for now, use this file set. Let's see what these files are for.
commands.cfg
The commands used by Nagios are defined here, e.g. how to check hosts and how to send mails.
templates.cfg
Inheritance starts here. This file is very important, because the "skeletons" of the things to monitor are defined here.
contacts.cfg
If a host switches to a warning or critical state, somebody has to be contacted. These contacts are defined here.
printer.cfg
Ready-to-use definitions for monitoring printers.
switch.cfg
Ready-to-use definitions for monitoring switches.
timeperiods.cfg
Defines when a staff member should be alerted.
Let's monitor the localhost!
If you've done everything as I told you,
>/usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg
will show something like:
Nagios Core 3.2.0
Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2009
License: GPL
Website: http://www.nagios.org
Nagios 3.2.0 starting... (PID=66618)
Local time is Wed Jan 27 12:05:58 UTC 2010
and you should be glad!
Have a look at your web server and click on host groups.
Let's monitor a printer!
Start with the easy stuff.
>vi /usr/local/etc/nagios/nagios.cfg
and delete the hash mark in front of
cfg_file=/usr/local/etc/nagios/objects/printer.cfg
Now you have told Nagios to read printer.cfg on startup.
>vi /usr/local/etc/nagios/objects/templates.cfg
Here you find the section about a generic host:
define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
and a few lines below, the printer definition:
define host{
name generic-printer ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, printers are monitored round the clock
check_interval 5 ; Actively check the printer every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each printer 10 times (max)
check_command check-host-alive ; Default command to check if printers are "alive"
notification_period workhours ; Printers are only used during the workday
notification_interval 30 ; Resend notifications every 30 minutes
notification_options d,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}
As you can see, generic-printer inherits from generic-host. So if you make changes in generic-host, the generic-printer skeleton will change too! You can override attributes simply by setting the attribute one level deeper. So if you add
retain_status_information 0
to generic-printer, it will override the value 1 inherited from generic-host, and so on.
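The inheritance rule can be illustrated with a small Python sketch. It is a simplified model, not how Nagios is implemented: the dictionaries are trimmed-down, hypothetical copies of the two templates, the function name resolve is made up, and only a single "use" parent per definition is handled.

```python
# Hypothetical, trimmed-down copies of the two templates from templates.cfg.
TEMPLATES = {
    "generic-host": {
        "notifications_enabled": "1",
        "retain_status_information": "1",
        "notification_period": "24x7",
    },
    "generic-printer": {
        "use": "generic-host",
        "check_interval": "5",
        "notification_period": "workhours",
        "retain_status_information": "0",   # the override discussed above
    },
}


def resolve(name, templates):
    """Merge a definition with everything it inherits through 'use'.

    Values set closer to the object win over inherited ones.
    Only single inheritance is modelled here.
    """
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"circular 'use' chain at {name}")
        seen.add(name)
        chain.append(templates[name])
        name = templates[name].get("use")
    merged = {}
    for definition in reversed(chain):      # parents first, children last
        merged.update(definition)
    merged.pop("use", None)
    return merged


printer = resolve("generic-printer", TEMPLATES)
print(printer["retain_status_information"], printer["notifications_enabled"])
```

The printer template ends up with its own retain_status_information value but still carries notifications_enabled from generic-host.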
I want to monitor an HP 3600n LaserJet, so I'll:
>vi /usr/local/etc/nagios/objects/printer.cfg
and add a new host:
define host{
use generic-printer
host_name NikosPrinter
alias HP3600n @ ITroom
address 100.100.100.101
hostgroups network-printers
notes_url http://100.100.100.66/wiki/index.php/Drucker
action_url http://100.100.100.185
}
The host gets its default settings from generic-printer, which in turn gets its default settings from generic-host.
use: the template the settings are inherited from
host_name: the name of the host in Nagios
alias: more information shown in the web interface
address: IP or FQDN (IP preferred)
hostgroups: used to group hosts, if you have more printers
notes_url: a link to our internal wiki, where you can find more information
action_url: a link to the web interface of the printer
Okay, the host is defined, now the services:
define service{
use generic-service
host_name NikosPrinter
service_description Printer Status
check_command check_hpjd!-C public
normal_check_interval 10 ; Check the service every 10 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined
notification_interval 0
}
service_description: name of the service in the web interface
normal_check_interval: check the service every 10 minutes under normal conditions
retry_check_interval: re-check the service every minute until its final/hard state is determined
notification_interval: how often the notification mail is repeated; I don't want to get spammed by printers, so I think one mail is enough (0 means no repeats)
define service{
use generic-service
hostgroup_name network-printers
service_description PING
check_command check_ping!3000.0,80%!5000.0,100%
normal_check_interval 10
retry_check_interval 1
notification_interval 0
}
The same as above, but here I use a host group instead of a host name for the ping check.
Let's have a look at the commands:
check_command check_hpjd!-C public
Arguments are separated by "!".
How to use the standard macros is described in the Nagios documentation, so I won't go any further into this.
The matching definition from commands.cfg:
define command{
command_name check_hpjd
command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
}
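How the check_command value and the command definition fit together can be sketched in Python. This is only a model of the substitution, not Nagios code: the function name is made up, the plugin directory used for $USER1$ is an assumption (the real value comes from resource.cfg), and the check_ping command line in the usage note below is likewise an assumed definition.

```python
def expand_check_command(check_command, command_lines, macros):
    """Turn a check_command value into the command line that would be run.

    check_command  -- e.g. "check_hpjd!-C public"
    command_lines  -- command_name -> command_line, as in commands.cfg
    macros         -- values for macros such as $USER1$ and $HOSTADDRESS$
    """
    name, *args = check_command.split("!")
    line = command_lines[name]
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg        # first argument becomes $ARG1$, ...
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


COMMANDS = {"check_hpjd": "$USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$"}
MACROS = {
    "$USER1$": "/usr/local/libexec/nagios",   # assumed plugin directory
    "$HOSTADDRESS$": "100.100.100.101",
}
print(expand_check_command("check_hpjd!-C public", COMMANDS, MACROS))
```

With a second argument, as in check_ping!3000.0,80%!5000.0,100%, the first part would land in $ARG1$ and the second in $ARG2$.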
Restart nagios:
>ps -ax | grep nagios
>kill [pid]
>/usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg
After the restart, the new printer and its services should show up in the web interface.
I installed Nagios, but Nagios was quite voiceless, so I want to send notifications via sendmail. Sendmail is installed by default with FreeBSD, and as I didn't want to build a whole mail server from scratch, I want to use our MS Exchange server.
First of all, check the DNS settings of your domain and make sure your FreeBSD machine has a DNS (A) record.
To make sure it works, you should get this response:
>hostname
nagios.domain.at
>host nagios.domain.at
nagios.domain.at has address 100.100.111.111
If you don't get this, have a look at your DNS table.
Now allow SMTP connections from your BSD host on the Exchange server.
After that you have to tell the FreeBSD host to hand its mail to a different host:
> vi /etc/mail/freebsd.submit.mc
change the line
FEATURE(`msp', `[127.0.0.1]')dnl
To
FEATURE(`msp', `[Ip of the Exchange server]')dnl
Or use the FQDN instead of the IP (not tested by me).
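If several hosts need the same change, the edit can be scripted. The Python sketch below only rewrites the text of the msp line; the function name set_msp_relay is made up, and 192.0.2.25 is a documentation address standing in for the Exchange server.

```python
import re

# Matches the relay argument of the msp feature, e.g. FEATURE(`msp', `[127.0.0.1]')dnl
MSP_RE = re.compile(r"(FEATURE\(`msp',\s*`\[)[^\]]*(\]'\)dnl)")


def set_msp_relay(mc_text, relay):
    """Return the .mc text with the msp relay replaced; fail if no msp line exists."""
    new_text, count = MSP_RE.subn(lambda m: m.group(1) + relay + m.group(2), mc_text)
    if count == 0:
        raise ValueError("no FEATURE(`msp', ...) line found")
    return new_text


original = "FEATURE(`msp', `[127.0.0.1]')dnl\n"
print(set_msp_relay(original, "192.0.2.25"), end="")
```

Raising an error when no msp line is found avoids silently writing back an unchanged file.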
Now install the submit.cf
>cd /etc/mail && make install-submit-cf
If everything worked fine, you should see an output like this:
cp freebsd.submit.mc [hostname].[domain].submit.mc
/usr/bin/m4 -D_CF_DIR_=/usr/share/sendmail/cf/ /usr/share/sendmail/cf/m4/cf.m4 [hostname].[domain].submit.mc > [hostname].[domain].submit.cf
install -m 444 [hostname].[domain].submit.cf /etc/mail/submit.cf
Now test sending mail with 'mail':
>mail -v [exchangeuser@domain]
Now you should see a lot of lines starting with
[exchangeuser@domain]... Connecting to [100.100.111.111] via relay...
220 EXCHANGE.doman.dc Microsoft ESMTP MAIL Service, Version: 6.0.3790.3959 ready
Now check your mail client; there should be a mail sent from your logged-in user.
Since we moved our NFS servers from standalone AIX to NetApp, we need separate monitoring.
I wrote this script; it runs on the xymon server and appears as a server test.
You need to change the (v)filer's etc/hosts.equiv file to allow rsh access from your xymon server.
If that's a problem, this script may run on any system for which the bb client binary is available.
A lot of the code is just for nice xymon formatting; it can easily be optimised for performance by avoiding the temp files.
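Stripped of the formatting and temp files, the core of the script that follows is a threshold check per qtree. Here is a minimal Python sketch of that logic, assuming input lines of the form "qtree used quota path" as produced by the script's awk filter. The function name and the sample values are made up, and skipping entries without a numeric quota is an assumption that the script itself does not make.

```python
def classify_quotas(report_lines, warn_limit=70.0, fatal_limit=80.0):
    """Classify 'qtree used quota path' lines by percentage used.

    Returns (overall_status, rows); each row is (colour, path, qtree, percent).
    Lines without a numeric, non-zero quota are skipped.
    """
    order = {"green": 0, "yellow": 1, "red": 2}
    overall = "green"
    rows = []
    for line in report_lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        qtree, used, quota, path = fields
        if not (used.isdigit() and quota.isdigit()) or int(quota) == 0:
            continue
        percent = int(used) * 100 / int(quota)
        if percent >= fatal_limit:
            colour = "red"
        elif percent >= warn_limit:
            colour = "yellow"
        else:
            colour = "green"
        if order[colour] > order[overall]:
            overall = colour
        rows.append((colour, path, qtree, round(percent, 3)))
    # Worst entries first, like the red/yellow/green blocks in the status message.
    rows.sort(key=lambda row: (-order[row[0]], -row[3]))
    return overall, rows


# Made-up sample values, not real filer output.
sample = [
    "home 850000 1000000 /vol/vol1/home",
    "proj 720000 1000000 /vol/vol1/proj",
    "tmp 100000 1000000 /vol/vol2/tmp",
    "scratch 5000 - /vol/vol2/scratch",
]
status, rows = classify_quotas(sample)
print(status)
for row in rows:
    print(row)
```

The default limits of 70 and 80 correspond to CLIMIT and FLIMIT in the script.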
#!/usr/bin/ksh
#Daffi 2009
#Script reads the quota report from a netapp vfiler over rsh
#change the vfiler's hosts.equiv and add the ip address of the rsh client, the xymon server in most cases
#for testing purpose, should be set in your xymon environment
#BBHOME="/home/hobbit/client"
#BB="${BBHOME}/bin/bb"
#BBDISP=xxx.xxx.xxx.xxx
#MACHINE=xxx
TESTNAME="netappquota"
FILERNAME="xxx"
#critical and fatal severity limits in percent, critical=yellow, fatal=red
CLIMIT="70"
FLIMIT="80"
#don't change these values, default status
STAT="green"
STATUSTEXT="all Qtree Quotas OK"
#remove files from the last test
find $BBHOME/tmp -type f -name "netapp*.txt" -exec rm {} \;
rsh ${FILERNAME} "quota report" | awk '/vol/{print $4,$5,$6,$9}' | while read QTREE USED QUOTA RPATH
do
PERC=$(echo "scale=3;(${USED} * 100) / $QUOTA" | bc)
stf=$(echo "${PERC} >= ${FLIMIT}" | bc)
stc=$(echo "${PERC} >= ${CLIMIT}" | bc)
if [ ${stf} -eq 1 ]
then
echo "&red ${RPATH} ${QTREE} ${PERC}% ${USED} ${QUOTA}" >> $BBHOME/tmp/netappquotared.txt
elif [ ${stc} -eq 1 ]
then
echo "&yellow ${RPATH} ${QTREE} ${PERC}% ${USED} ${QUOTA}" >> $BBHOME/tmp/netappquotayellow.txt
else
echo "&green ${RPATH} ${QTREE} ${PERC}% ${USED} ${QUOTA}" >> $BBHOME/tmp/netappquotagreen.txt
fi
done
if [ -f $BBHOME/tmp/netappquotayellow.txt ] ; then STAT=yellow ; STATUSTEXT="One or more Quotas over defined CRITICAL level (${CLIMIT}%)" ; fi
if [ -f $BBHOME/tmp/netappquotared.txt ] ; then STAT=red ; STATUSTEXT="One or more Quotas over defined FATAL level (${FLIMIT}%)" ; fi
${BB} ${BBDISP} "status ${MACHINE}.${TESTNAME} ${STAT} ${STATUSTEXT}
$(echo "Path Qtree %used used[MB] quota[MB]" | awk '{printf ("%-43s" "%-30s" "%-15s" "%-15s" "%-15s\n",$1,$2,$3,$4,$5,$6)}')
$([ -f $BBHOME/tmp/netappquotared.txt ] && cat $BBHOME/tmp/netappquotared.txt | sort -rnk 4 | awk '{printf ("%-5s" "%-40s" "%-30s" "%-15s" "%-15.2f" "%-15.2f\n",$1,$2,$3,$4,$5/1024,$6/1024)}' && echo " ")
$([ -f $BBHOME/tmp/netappquotayellow.txt ] && cat $BBHOME/tmp/netappquotayellow.txt | sort -rnk 4 | awk '{printf ("%-8s" "%-40s" "%-30s" "%-15s" "%-15.2f" "%-15.2f\n",$1,$2,$3,$4,$5/1024,$6/1024)}' && echo " ")
$([ -f $BBHOME/tmp/netappquotagreen.txt ] && cat $BBHOME/tmp/netappquotagreen.txt | sort -rnk 4 | awk '{printf ("%-7s" "%-40s" "%-30s" "%-15s" "%-15.2f" "%-15.2f\n",$1,$2,$3,$4,$5/1024,$6/1024)}')
"
Last blog posts
-
tivoli itm 6.2 change agent hostname to other then original system hostname
Tue 15 of Mar., 2011 19:17 CET
-
X11 secure display forwarding via ssh error
Tue 15 of Mar., 2011 19:03 CET
-
Android 2.2.1 delay between accepting call and actually hearing the caller
Tue 15 of Mar., 2011 18:50 CET
-
AIX: Get PVID directly from hdisk using od
Thu 15 of Apr., 2010 15:25 CEST
-
Power Blade: Add additional vscsi adapters to lpar / vhosts to vio server
Wed 24 of Mar., 2010 09:49 CET
-
Bug in xymon 4.3.3 Beta 2 splitncv
Tue 23 of Feb., 2010 10:11 CET
-
Analysing screen and browser window sizes reported by AWStats using R
Thu 18 of Feb., 2010 14:14 CET
-
Nagios on FreeBSD
Wed 27 of Jan., 2010 13:24 CET
-
Freebsd sendmail via Exchangeserver
Thu 17 of Dec., 2009 10:39 CET
-
xymon netapp vfiler quota monitoring
Tue 24 of Nov., 2009 17:55 CET