Hello all,

We have a Splunk alert that searches for high temperature events on Juniper routers. It's a very straightforward search:

index=main CHASSISD_FRU_HIGH_TEMP_CONDITION OR CHASSISD_OVER_TEMP_SHUTDOWN_TIME OR CHASSISD_OVER_TEMP_CONDITION OR CHASSISD_TEMP_HOT_NOTICE OR CHASSISD_FPC_OPTICS_HOT_NOTICE OR CHASSISD_HIGH_TEMP_CONDITION OR (CHASSISD "Temperature back to normal") NOT UI_CMDLINE_READ_LINE

I'd like this Splunk alert to ignore temperature alarm events on the host router4-utah when FPC 11 (FPC: MPC5E 3D 24XGE+6XLGE @ 11/*/*) is running hot. The events always come in the following order, within 25 seconds of each other.

The alarm trigger events:

Sep 27 05:26:00 re0.router4-utah chassisd[7726]: CHASSISD_BLOWERS_SPEED_FULL: Fans and impellers being set to full speed [system warm]
Sep 27 05:26:00 re0.router4-utah alarmd[7895]: Alarm set: Temp sensor color=YELLOW, class=CHASSIS, reason=Temperature Warm
Sep 27 05:26:00 re0.router4-utah craftd[7730]: Minor alarm set, Temperature Warm
Sep 27 05:26:00 re0.router4-utah chassisd[7726]: CHASSISD_HIGH_TEMP_CONDITION: Chassis temperature over 60 degrees C (but no fan/impeller failure detected)
Sep 27 05:26:02 re0.router4-utah chassisd[7726]: CHASSISD_SNMP_TRAP6: SNMP trap generated: Over Temperature! (jnxContentsContainerIndex 7, jnxContentsL1Index 12, jnxContentsL2Index 0, jnxContentsL3Index 0, jnxContentsDescr FPC: MPC5E 3D 24XGE+6XLGE @ 11/*/*, jnxOperatingTemp 91)

The alarm clear events:

Sep 27 05:26:21 re0.router4-utah alarmd[7895]: Alarm cleared: Temp sensor color=YELLOW, class=CHASSIS, reason=Temperature Warm
Sep 27 05:26:21 re0.router4-utah craftd[7730]: Minor alarm cleared, Temperature Warm

The goal is to keep the normal temperature alert running as it always has, but ignore the host router4-utah when it triggers and clears temperature alarms on FPC 11. I think the easiest way to state the rule is: ignore any temperature alarm that triggers and clears on router4-utah within 25 seconds of this line:

Sep 27 05:26:02 re0.router4-utah chassisd[7726]: CHASSISD_SNMP_TRAP6: SNMP trap generated: Over Temperature! (jnxContentsContainerIndex 7, jnxContentsL1Index 12, jnxContentsL2Index 0, jnxContentsL3Index 0, jnxContentsDescr FPC: MPC5E 3D 24XGE+6XLGE @ 11/*/*, jnxOperatingTemp 91)

Any assistance you can provide is much appreciated!

Thanks.
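
To pin down the rule I have in mind, here is a rough Python sketch of the logic, run outside Splunk against raw syslog lines. It is not an SPL solution, only an illustration of the behaviour I'm after. The function names (`parse`, `suppress_fpc11`), the regular expression, and the placeholder year are my own assumptions. It also assumes the host can be matched as a substring of the syslog hostname (`re0.router4-utah`).

```python
import re
from datetime import datetime, timedelta

# Assumptions for this sketch (taken from the sample events above).
WINDOW = timedelta(seconds=25)
HOST = "router4-utah"
FPC11_DESCR = "jnxContentsDescr FPC: MPC5E 3D 24XGE+6XLGE @ 11/*/*"

# Matches lines like: "Sep 27 05:26:00 re0.router4-utah chassisd[7726]: message"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")


def parse(line, year=2000):
    """Return (timestamp, host, message), or None if the line does not parse.

    Syslog timestamps carry no year, so a placeholder year is supplied.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return ts, m.group(2), m.group(3)


def suppress_fpc11(lines):
    """Drop router4-utah events within 25 s of an FPC 11 over-temperature trap.

    Every other line, including unparseable ones, is kept unchanged.
    Note that the trap line itself is also dropped.
    """
    parsed = [(line, parse(line)) for line in lines]

    # Timestamps of the FPC 11 trap on the target host.
    trap_times = [
        p[0]
        for _, p in parsed
        if p is not None
        and HOST in p[1]
        and "CHASSISD_SNMP_TRAP6" in p[2]
        and FPC11_DESCR in p[2]
    ]

    kept = []
    for line, p in parsed:
        if p is not None:
            ts, host, _ = p
            near_trap = any(abs(ts - t) <= WINDOW for t in trap_times)
            if HOST in host and near_trap:
                continue  # part of the FPC 11 trigger/clear burst: ignore
        kept.append(line)
    return kept
```

Passing the sample events above through `suppress_fpc11` should drop the whole router4-utah burst, while temperature events from other hosts, or from router4-utah with no FPC 11 trap within 25 seconds, are left alone. That is the behaviour I'd like the Splunk alert to have.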