One of my customers recently asked me to monitor its UPS, an MGE Galaxy 5000. This model includes a network management card that lets me monitor the device using SNMP.
In this example I’ll only explain how I created a trap-based monitor.
When looking at the UPS MIB, I noticed that traps come in pairs, one for failure and one for restore, for example:
- upsmgByPassUnavailable : 1.3.6.1.4.1.705.1.11.15
- upsmgByPassAvailable : 1.3.6.1.4.1.705.1.11.16
- upsmgUtilityFailure : 1.3.6.1.4.1.705.1.11.17
- upsmgUtilityRestored : 1.3.6.1.4.1.705.1.11.18
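To make the pairing explicit, here is a minimal Python sketch of the four traps listed above. The trap names and trailing numbers come from the list. The `1.3.6.1.4.1` prefix is the standard SNMP enterprises prefix, which I am assuming here. The dictionary and function names are hypothetical and only there for illustration.

```python
# Assumption: the standard enterprises prefix followed by the MGE branch (705.1.11).
MGE_TRAP_PREFIX = "1.3.6.1.4.1.705.1.11"

# Trap names and numbers as listed in the UPS MIB excerpt above.
TRAP_NUMBERS = {
    "upsmgByPassUnavailable": 15,
    "upsmgByPassAvailable": 16,
    "upsmgUtilityFailure": 17,
    "upsmgUtilityRestored": 18,
}

# Each failure trap is paired with the trap that clears it.
FAILURE_TO_RESTORE = {
    "upsmgByPassUnavailable": "upsmgByPassAvailable",
    "upsmgUtilityFailure": "upsmgUtilityRestored",
}


def trap_oid(name: str) -> str:
    """Return the OID of a trap as it appears in the list above."""
    return f"{MGE_TRAP_PREFIX}.{TRAP_NUMBERS[name]}"


def restore_trap_for(failure_name: str) -> str:
    """Return the name of the restore trap paired with a failure trap."""
    return FAILURE_TO_RESTORE[failure_name]


if __name__ == "__main__":
    for failure, restore in FAILURE_TO_RESTORE.items():
        print(f"{failure} ({trap_oid(failure)}) -> {restore} ({trap_oid(restore)})")
```

Each failure/restore pair is a candidate for one two-state monitor, which is what the rest of this post builds for the utility failure pair.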
I was glad to discover this: instead of creating basic alert rules, which would leave my device in a green state even with 10 critical alerts open, I could build monitors that actually affect the state of my device.
First I created a simple SNMP Event (trap) collection rule with no filter so that I could see all traps coming from my UPS. I tested with the upsmgUtilityFailure and upsmgUtilityRestored traps, as I could easily trigger them from the UPS management interface:
OK, I can receive traps. Now I can create the monitor (remember not to save it in the Default Management Pack). Choose Simple Trap Detection:
Enter the name and description you want and make sure the Monitor Target is SNMP Network Device. You may notice that my monitor is disabled by default. That is because I only wanted the monitor to run for my UPS, so I created a group containing my device and, once the monitor was created, used an override to enable it for that group:
On the First SnmpTrapProvider page, just check the all traps checkbox:
On the next page, use the expression /DataItem/SnmpVarBinds/SnmpVarBind/Value Equals 1.3.6.1.4.1.705.1.11.0.17. I took the syntax from David Allen’s blog:
Why did I use SnmpVarBind in my expression? If you look back at the first screenshot of this post, you can see the event data at the bottom, where the OID I want to filter on is the third parameter.
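To illustrate what that expression tests, here is a small Python sketch that applies the same path to a trap data item. The XML below is a made-up, simplified sample and not a capture from my UPS. The element layout, the first two varbinds and the function name are assumptions. Only the path and the failure OID come from the steps above (with the standard `1.3.6.1` prefix assumed).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified trap data item; real event data carries more fields.
SAMPLE_DATA_ITEM = """
<DataItem>
  <SnmpVarBinds>
    <SnmpVarBind><OID>1.3.6.1.2.1.1.3.0</OID><Value>12345</Value></SnmpVarBind>
    <SnmpVarBind><OID>1.3.6.1.4.1.705.1.4.1.0</OID><Value>1</Value></SnmpVarBind>
    <SnmpVarBind><OID>1.3.6.1.6.3.1.1.4.1.0</OID><Value>1.3.6.1.4.1.705.1.11.0.17</Value></SnmpVarBind>
  </SnmpVarBinds>
</DataItem>
"""

FAILURE_OID = "1.3.6.1.4.1.705.1.11.0.17"
RESTORE_OID = "1.3.6.1.4.1.705.1.11.0.18"  # assumed: same form, second trap of the pair


def matches_trap(data_item_xml: str, expected_oid: str) -> bool:
    """True if any /DataItem/SnmpVarBinds/SnmpVarBind/Value equals expected_oid."""
    root = ET.fromstring(data_item_xml)  # root element is <DataItem>
    values = root.findall("./SnmpVarBinds/SnmpVarBind/Value")
    return any((v.text or "").strip() == expected_oid for v in values)


if __name__ == "__main__":
    print(matches_trap(SAMPLE_DATA_ITEM, FAILURE_OID))
    print(matches_trap(SAMPLE_DATA_ITEM, RESTORE_OID))
```

In this sample the trap OID sits in the value of the third varbind, which mirrors what I saw in the event data. The value in the expression also has an extra `.0.` before the trap number compared with the MIB listing. As far as I can tell this is the usual way an SNMPv1 enterprise-specific trap is written as a trap OID, but treat that as my reading rather than a statement from the MIB.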
On the Second SnmpTrapProvider page, check the all traps checkbox again.
On the next page, use the same expression as above but with the second OID:
Configure a warning or a critical state for the first event:
and finally configure an alert and validate:
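Conceptually, the monitor configured in these steps behaves like a small two-state machine: the first trap turns the device unhealthy and the second one brings it back to healthy. The Python sketch below only illustrates that logic and is not how Operations Manager implements it. The class name, the state names and the restore OID (the failure OID with 18 instead of 17) are assumptions.

```python
from dataclasses import dataclass

FAILURE_OID = "1.3.6.1.4.1.705.1.11.0.17"  # first event (utility failure)
RESTORE_OID = "1.3.6.1.4.1.705.1.11.0.18"  # second event (assumed restore OID)


@dataclass
class TrapPairMonitor:
    """Illustrative two-state monitor driven by a failure/restore trap pair."""

    failure_oid: str
    restore_oid: str
    unhealthy_state: str = "Critical"  # could also be "Warning"
    state: str = "Healthy"

    def on_trap(self, trap_oid: str) -> str:
        """Update and return the health state for an incoming trap OID."""
        if trap_oid == self.failure_oid:
            self.state = self.unhealthy_state
        elif trap_oid == self.restore_oid:
            self.state = "Healthy"
        # Any other trap leaves the state unchanged.
        return self.state


if __name__ == "__main__":
    monitor = TrapPairMonitor(FAILURE_OID, RESTORE_OID)
    for oid in (FAILURE_OID, "1.3.6.1.4.1.705.1.11.0.15", RESTORE_OID):
        print(oid, "->", monitor.on_trap(oid))
```

This is the advantage over plain alert rules: the restore trap resets the state by itself, so the device goes back to green once utility power returns.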
Once you have created the monitor (and, in my case, set an override on it), you can see the result: