Posts Tagged ‘websphere’
* how to monitor ibm mq from nagios
Posted on October 21st, 2008 by doug. Filed under websphere.
This was one of the search terms that led someone to an article here. I hadn’t addressed it directly, but I use Nagios to monitor my company’s server environment, and I implemented that monitoring specifically for IBM WebSphere MQ.
For MQ, I run Nagios checks against queue depth and processes. I installed three plugins to run against WebSphere MQ. One was developed for my company’s needs (qdepth), one was changed slightly (channels), and the last (message age) was debugged, found not to measure accurately, and never fixed.
Here’s the Nagios console for the WebSphere MQ server. The “message age” in the second qdepth check service title is deceptive: that check is actually measuring qdepth…
This is the commands section from the nrpe.cfg file on the WebSphere MQ server.
command[check_mq_channel]=/usr/local/nagios/libexec/check_mq_channel.sh $ARG1$ $ARG2$
command[check_mq_msgage]=/usr/local/nagios/libexec/check_mq_msgage.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$
command[wmq_check_qdepth]=/usr/local/nagios/libexec/wmq_check_qdepth.pl $ARG1$ $ARG2$ $ARG3$
Of these, we are really only using the qdepth monitoring. The channels are triggered, so an inactive state is fine, but the plugin as written only tests for “running”. The message age plugin, as I mentioned, doesn’t actually work.
When I first looked at setting this messaging up and then monitoring it, I searched for “nagios monitoring MQ websphere” and found several pre-written plugins. I tested each plugin for usability, for accurate results, and for whether it covered what we needed to monitor.
In testing, the message age plugin returned a hard-coded result rather than measuring anything and returning a valid answer. I started to fix it, set it aside, and never finished. I don’t recall the source for the plugin. Check each piece of code you download from the internet: it may have gone through extensive development and testing, or it could just as easily have been hacked together in an hour. Your mileage may seriously vary, and I would highly recommend you verify any of this before you bet your job on it.
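One way to catch a plugin that returns a hard-coded answer is to run it against conditions you control and compare the exit status with what you expect. The following is a minimal sketch of that idea, not something from the original setup; the `expect_status` helper and the `TEST.QUEUE` name are hypothetical.

```shell
#!/bin/sh
# Run a plugin command and compare its exit status with the expected one.
# Usage: expect_status EXPECTED command [args...]
expect_status() {
    expected=$1
    shift
    "$@" >/dev/null 2>&1
    actual=$?
    if [ "$actual" -eq "$expected" ]; then
        echo "PASS: '$*' exited $actual"
        return 0
    fi
    echo "FAIL: '$*' exited $actual, expected $expected"
    return 1
}

# Hypothetical usage: load a test queue to a known depth, then run for example
#   expect_status 0 ./wmq_check_qdepth.pl 5 10 TEST.QUEUE   # depth below 5
#   expect_status 2 ./wmq_check_qdepth.pl 5 10 TEST.QUEUE   # depth 10 or more
```

A plugin that exits with the same status no matter how the queue is loaded will fail at least one of those checks.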
Here’s the qdepth plugin. I think I wrote or rewrote this pretty much from scratch, but the original concept for parsing runmqsc output came from one of the plugins I downloaded, written by Kyle O’Donnell; the channel plugin has his original author credit intact. This plugin has alerted once to an increasing qdepth, which turned out to be an issue with an SSL certificate.
#! /bin/perl
## wmq_check_qdepth.pl
#
# nrpe (nagios) script to check websphere qdepth
# uses runmqsc binary
#
# display queue ('APP.REQUEST')
# 8 : display queue ('APP.REQUEST')
# AMQ8409: Display Queue details.
# QUEUE(APP.REQUEST) TYPE(QLOCAL)
# ACCTQ(QMGR) ALTDATE(2008-01-22)
# ALTTIME(14.18.23) BOQNAME( )
# BOTHRESH(0) CLUSNL( )
# CLUSTER( ) CLWLPRTY(0)
# CLWLRANK(0) CLWLUSEQ(QMGR)
# CRDATE(2008-01-22) CRTIME(14.18.23)
# CURDEPTH(0) DEFBIND(OPEN)
# DEFPRTY(0) DEFPSIST(NO)
# DEFSOPT(SHARED) DEFTYPE(PREDEFINED)
# DESCR( ) DISTL(NO)
# GET(ENABLED) HARDENBO
# INITQ( ) IPPROCS(0)
# MAXDEPTH(5000) MAXMSGL(4194304)
# MONQ(QMGR) MSGDLVSQ(PRIORITY)
# NOTRIGGER NPMCLASS(NORMAL)
# OPPROCS(0) PROCESS( )
# PUT(ENABLED) QDEPTHHI(80)
# QDEPTHLO(20) QDPHIEV(DISABLED)
# QDPLOEV(DISABLED) QDPMAXEV(ENABLED)
# QSVCIEV(NONE) QSVCINT(999999999)
# RETINTVL(999999999) SCOPE(QMGR)
# SHARE STATQ(QMGR)
# TRIGDATA( ) TRIGDPTH(1)
# TRIGMPRI(0) TRIGTYPE(FIRST)
# USAGE(NORMAL)
### Variables ###
# test values set if this flag is true (1)
### THIS MUST BE SET TO 0 IN PRODUCTION!!! ###
my $test = 0;
# debug flag (adds messages)
my $debug = 0;
my $LOG = "/tmp/wmq_check_qdepth.pl.log";
# runmqsc binary
my $MQSC = "/opt/mqm/bin/runmqsc";
### ARGS ###
# first argument is warn level
my $WARN = shift;
# second arg is critical level
my $CRIT = shift;
# third arg is queue name
my $QUEUE = shift;
# set for dev purposes
if ($test) {
$WARN = 5;
$CRIT = 10;
$QUEUE = "1A33.EVG.REQUEST";
}
# validate
# WARN and CRIT must be greater than 0 and CRIT must be greater than WARN
# (Nagios plugin exit codes are 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN,
# so usage errors exit 3)
unless (($WARN > 0) && ($CRIT > 0)) {
print ("Command Failed: WARN and CRIT levels must be greater than 0\n");
exit 3;
}
unless ($CRIT > $WARN) {
print ("Command Failed: CRIT must be greater than WARN\n");
exit 3;
}
### Subs ###
### MAIN ###
# run query
my $result = `echo "display queue ('${QUEUE}')" | $MQSC | grep CURDEPTH`;
print ("result: $result\n") if $debug;
# parse result
my @lines = split ("\n", $result); # divide into an array by end of line...
# each element of the array will contain a single line
# set variables
my ($PARAM, $VALUE);
for my $line (@lines) {
# each line is one or two elements like "QDPLOEV(DISABLED) QDPMAXEV(ENABLED)"
# divide those...
my ($first, $discard) = split (' ', $line);
print ("\$first: $first \$discard $discard\n") if $debug;
($PARAM, $VALUE) = split ('\(', $first);
$VALUE =~ s/\)//;
print ("\$PARAM: $PARAM \$VALUE: $VALUE\n") if $debug;
}
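# testing value
$VALUE = 13 if $test;
# if CURDEPTH could not be parsed (queue not found, runmqsc failed), report UNKNOWN
# rather than letting an undefined value compare equal to 0 and pass as OK
unless (defined $VALUE && $VALUE =~ /^\d+$/) {
print ("UNKNOWN: could not read CURDEPTH for $QUEUE\n");
exit 3;
}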
# check for $WARN and $CRIT levels, exit 0 as OK, 1 as warn or 2 as critical
if ($VALUE == 0) {
print ("OK: found qdepth for $QUEUE at 0\n");
exit 0;
} elsif ($VALUE < $WARN) {
print ("OK: found qdepth for $QUEUE at $VALUE\n");
exit 0;
} elsif (($VALUE >= $WARN) && ($VALUE < $CRIT)) {
print ("WARN: qdepth of $QUEUE is at $VALUE: exceeds WARN thresh of $WARN\n");
exit 1;
} elsif ($VALUE >= $CRIT) {
print ("CRITICAL: qdepth for $QUEUE at $VALUE: exceeds CRITICAL thresh of $CRIT\n");
exit 2;
}
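The parsing above relies on CURDEPTH being the first attribute on its line of runmqsc output. As a sketch of a slightly more defensive approach (an illustration, not the plugin in use), the value can be pulled out wherever it appears on the line. The function name `parse_curdepth` is hypothetical.

```shell
#!/bin/sh
# Print the CURDEPTH value found anywhere in runmqsc output read from stdin.
# Prints nothing and returns 1 when no CURDEPTH(n) attribute is present.
parse_curdepth() {
    depth=$(sed -n 's/.*CURDEPTH(\([0-9][0-9]*\)).*/\1/p' | head -n 1)
    [ -n "$depth" ] || return 1
    echo "$depth"
}

# Hypothetical usage, following the same pipe the Perl plugin uses:
#   echo "display queue ('APP.REQUEST')" | /opt/mqm/bin/runmqsc | parse_curdepth
```

Returning a failure when nothing matches lets the caller report UNKNOWN instead of treating a missing value as a depth of 0.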
This is the channel status plugin. I may have rewritten the original data-gathering runmqsc string, but the majority of the plugin remained intact…
#!/bin/ksh
#
# check queue manager status
#
# Kyle O'Donnell
#
#$Id: check_mq_channel,v 1.2 2007/04/04 14:36:02 kodonnel Exp $
#
# debug
DATE=`date`
LOG="/tmp/nrpe_check_mq_channel.sh.log"
echo "" >> $LOG
echo $DATE >> $LOG
echo "" >> $LOG
[ $# -ne 2 ] && echo "usage: $0 <channel> <qmgr>" && exit 3
channel=$1
qmgr=$2
echo "channel: $channel qmanager: $qmgr" >> $LOG
RUNMQSC="/opt/mqm/bin/runmqsc"
chanstatus=`echo "dis chs(${channel}) status" | ${RUNMQSC} ${qmgr} | grep -i "status(running)"`
echo "channel status result: $chanstatus" >> $LOG
if echo $chanstatus |grep -i "status(running)" > /dev/null 2>&1; then
STATE=0
printf "${channel} on ${qmgr} running"
echo ""
echo ""
else
STATE=2
printf "${channel} on ${qmgr} not running"
echo ""
echo ""
fi
echo "state: $STATE" >> $LOG
exit $STATE;
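Because this plugin treats anything other than “running” as critical, it cannot be used as-is for triggered channels, where inactive is a healthy state. The sketch below shows one way the status mapping could be widened. It is an untested illustration under assumptions: the function name `channel_state` is hypothetical, the choice of which states count as critical is a guess at a reasonable policy, and it assumes runmqsc reports message AMQ8420 when a channel has no status entry.

```shell
#!/bin/sh
# Map "dis chs(...)" output read from stdin to a Nagios exit status.
channel_state() {
    output=$(cat)
    status=$(printf '%s\n' "$output" |
        sed -n 's/.*STATUS(\([A-Z]*\)).*/\1/p' | head -n 1)
    case "$status" in
        RUNNING)
            echo "OK: channel running"; return 0 ;;
        STOPPED|RETRYING)
            echo "CRITICAL: channel $status"; return 2 ;;
        "")
            # Assumption: AMQ8420 means no status entry, i.e. an inactive channel.
            if printf '%s\n' "$output" | grep -q 'AMQ8420'; then
                echo "OK: channel inactive"; return 0
            fi
            echo "UNKNOWN: no channel status in runmqsc output"; return 3 ;;
        *)
            echo "WARNING: channel $status"; return 1 ;;
    esac
}

# Hypothetical usage:
#   echo "dis chs(${channel}) status" | ${RUNMQSC} ${qmgr} | channel_state
```

Treating empty output without AMQ8420 as UNKNOWN keeps a failed runmqsc call from being mistaken for an idle channel.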
Here’s the server.cfg file for the WebSphere MQ machine on the Nagios server:
define service {
use generic-service
host_name mq1
service_description Host Alive
check_period 24x7
contact_groups unix-administrators
notification_period 24x7
check_command check-host-alive
}
define service {
use generic-service
host_name mq1
service_description Sonic Bridge java process
check_period 24x7
contact_groups esb-administrators
notification_period 24x7
check_command check_unix_proc!mqm!1!java
}
define service {
use generic-service
host_name mq1
service_description SSB queue depth EVGPQM01.DEAD.QUEUE message age
check_period 24x7
contact_groups systems-services,help_desk
notification_period 24x7
check_command wmq_check_qdepth!1!3!QMGR01!QMGR01.DEAD.QUEUE
}
define service {
use generic-service
host_name mq1
service_description server queue depth APPLICATION.RESPONSE
check_period 24x7
contact_groups systems-services,help_desk
notification_period 24x7
check_command wmq_check_qdepth!5!10!APPLICATION.RESPONSE
}
define service {
use generic-service
host_name mq1
service_description server queue depth OPPOSITE-QMGR
check_period 24x7
contact_groups systems-services,help_desk
notification_period 24x7
check_command wmq_check_qdepth!5!10!OPPOSITE-QMGR
}
define service {
use generic-service
host_name mq1
service_description WMQ command server
check_period 24x7
contact_groups systems-services,help_desk
notification_period 24x7
check_command check_unix_proc!mqm!1!amqpcsea
}
define service {
use generic-service
host_name mq1
service_description WMQ Critical process manager
check_period 24x7
contact_groups systems-services,help_desk
notification_period 24x7
check_command check_unix_proc!mqm!1!amqzmuc0
}
The strategy is to monitor qdepth and the processes specific to IBM WebSphere MQ on the WebSphere MQ server, along with the normal UNIX processes and disk space.
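The check_unix_proc command used in the service definitions is not shown here. As a rough sketch of what a process check of that shape does, assuming its arguments are user, minimum count, and process name, the function below counts matching command names and goes critical when there are too few. The name `check_proc_count` is hypothetical.

```shell
#!/bin/sh
# Read one command name per line on stdin and report CRITICAL when fewer
# than MIN of them match NAME exactly.
# Usage: check_proc_count MIN NAME
check_proc_count() {
    min=$1
    name=$2
    count=$(awk -v name="$name" '$1 == name { n++ } END { print n + 0 }')
    if [ "$count" -ge "$min" ]; then
        echo "OK: $count $name process(es) running"
        return 0
    fi
    echo "CRITICAL: $count $name process(es) running, expected at least $min"
    return 2
}

# Hypothetical usage for the mqm user:
#   ps -u mqm -o comm= | check_proc_count 1 amqzmuc0
```

Reading the process list from stdin keeps the counting logic separate from `ps`, whose output format differs between UNIX variants.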
— dsm
* refreshing SSL certificates in websphere MQ
Posted on July 30th, 2008 by doug. Filed under websphere.
The first project I was given when I started at Evergreen Investments involved IBM’s WebSphere MQ messaging application. I took a development instance of the application and translated that to the requirements for a production deployment of the application.
It has been completely bulletproof. Set up correctly and sized appropriately, it just works. Eventually all good things come to an end, and you must maintain the service. SSL certificates expire, and must be replaced with new certificates.
In WebSphere MQ versions prior to 6.0, you had to restart the queue manager, the local god of the application, to have the new certificate information deploy. That was somewhat disruptive, but it ensured that the information you were using for bringing up encrypted channels of communication was the new information, not the old.
With version 6.0, there is a command you can run in runmqsc (the WebSphere MQ command line shell and script interface) –
REFRESH SECURITY TYPE(SSL)
The same command run as
REFRESH SECURITY
will refresh only the authorization service cache (TYPE(AUTHSERV) is the default on UNIX), and leave the SSL certificate information untouched and unrefreshed. That leaves you with the expired certificate still in use, even though you have replaced it and (you think) refreshed the information…
This is obvious in hindsight. And completely baffling, until you find it, while trying to get the application to come back up after replacing an expired certificate.
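For scripted use, the command can be piped into runmqsc the same way the monitoring plugins do. This is a minimal sketch, assuming runmqsc lives at /opt/mqm/bin/runmqsc as in the plugins; the function names are hypothetical.

```shell
#!/bin/sh
# Assumed runmqsc location; override by setting RUNMQSC in the environment.
RUNMQSC="${RUNMQSC:-/opt/mqm/bin/runmqsc}"

# Emit the MQSC command that refreshes the SSL certificate information.
ssl_refresh_mqsc() {
    echo "REFRESH SECURITY TYPE(SSL)"
}

# Pipe the command into runmqsc for the named queue manager.
refresh_ssl() {
    qmgr=$1
    if [ -z "$qmgr" ]; then
        echo "usage: refresh_ssl QMGR" >&2
        return 3
    fi
    ssl_refresh_mqsc | "$RUNMQSC" "$qmgr"
}

# Hypothetical usage, run as the mqm user:
#   refresh_ssl QMGR01
```

Spelling out TYPE(SSL) in the script avoids falling back to the default refresh type by accident.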
—dsm

