SUMMARY: The ISE TA add-on has been working great for months, but we recently discovered an odd problem: for two smaller WLCs (wireless LAN controllers), ISE sends logs with a NULL value in just one field, and that appears to break message parsing from that point forward, only on those specific CISE_Passed_Authentications logs. The same type of logs (same authentication, same policy, same EVERYTHING on ISE) sent by ISE for the much larger WLC get parsed as expected: nothing breaks, nothing is missing.
QUESTION: I'm looking for any advice on how to get the ISE TA add-on to ignore the single NULL value and move on to parse the rest of the log message sent by ISE.
SORDID DETAILS: The issue was not easy to see because most of our ISE logs come from one big fat controller cluster - a big ol' Cisco Flexcontroller WLC configured to handle dozens of large sites. This larger WLC does the bulk of authentications to ISE. However, we also have two smaller local controllers at specific remote sites. All three WLCs are on the exact same code version, and all WLC settings have been verified two or three times now. It is also KEY to understand that all WLCs, large and small, use the exact same policy on ISE - there are no differentiators regarding NAS-ID, NAS device or anything else. A user authenticating through the large WLC hits the identical ISE policies as a user authenticating at one of the smaller WLC sites. You can probably disregard the fact that there are three WLCs (one large, two small); I'm only relaying this info to show that the Splunk TA can and does parse these logs correctly when there is no NULL value. The Splunk ISE TA just needs to handle the NULLs gracefully when/if they're encountered (which it currently does not).
As I later discovered, the parsing problem is clearly visible when comparing the ISE logs for the large WLC with the ISE logs for the smaller WLCs: the normal CISE_Passed_Authentications logs for the large WLC are humongous, 30+ lines in Splunk and probably the full 8192-byte maximum, while the logs whose parsing breaks are only about 6 lines in Splunk's GUI. The ISE syslog Maximum Length was set to 8192 months ago, from the start, per the documentation.
All the "short logs" get cut off at the exact same point in the Splunk GUI, right after a field called "attribute-89= ," as shown here (NOTE: this is just a snippet to illustrate the issue; the ellipses ........ indicate text before/after):
AS SEEN ON SPLUNK, PARSING STOPS HERE
...........NAS-Port-Type=Wireless - IEEE 802.11, Tunnel-Type=(tag=0) VLAN, Tunnel-Medium-Type=(tag=0) 802, Tunnel-Private-Group-ID=(tag=0) 1, attribute-89= ,
SAME RAW LOG IS MUCH LARGER
..........NAS-Port-Type=Wireless - IEEE 802.11, Tunnel-Type=( tag=0) VLAN, Tunnel-Medium-Type=(tag=0) 802, Tunnel-Private-Group-ID=(tag=0) 1, attribute-89= , attribute-131=00:00:00:01, cisco-av-pair=audit-session-id=0a40016400020ebe5c5ca1e2, cisco-av-pair=mDNS=true, Airespace-Wlan-Id=100, ..........
After this field called "attribute-89= ," there is no more message in Splunk. It stops right there, so the rest of the parsed fields are not present on those logs. In other words, Splunk correctly parses all fields up to that point but nothing after it, as if the message ends there (and it looks that way in the GUI too: no more text after that last comma).
First I had to confirm that the entire ISE syslog message was getting to Splunk, and that ISE wasn't somehow cutting it off before sending it. So I configured ISE to send duplicate syslogs: one copy to a regular Linux server, and one to Splunk. The copy on the Linux server showed the entire log message, which confirmed that ISE is not truncating it and that there is a lot more data after this "attribute-89= ,".
Moreover, searching those raw syslogs on the Linux box helped me confirm that at no point is there ever an empty value pair OTHER THAN this "attribute-89". IOW, only this field comes paired with a NULL value. That is why I've arrived at the conclusion that ISE is sending a bad value pair, which causes Splunk to stop parsing right there and disregard the rest of the large log message (I guess it gets tossed or something).
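For anyone who wants to repeat that check on their own raw syslog copy, the search for empty value pairs can be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes the attributes are comma-separated key=value pairs as in the snippet above, the function name count_empty_keys is made up for illustration, and the second sample line is a hypothetical "healthy" line pieced together from the same snippet.

```python
import re
from collections import Counter

# A key followed by "=" and then nothing but optional whitespace
# before the next comma (or the end of the line).
EMPTY_PAIR = re.compile(r"(?:^|,\s*)([A-Za-z][\w-]*)=\s*(?=,|$)")

def count_empty_keys(lines):
    """Count how often each key appears with an empty value."""
    counts = Counter()
    for line in lines:
        counts.update(EMPTY_PAIR.findall(line.strip()))
    return counts

sample_lines = [
    # Fragment taken from the snippet in this post.
    "Tunnel-Private-Group-ID=(tag=0) 1, attribute-89= , attribute-131=00:00:00:01",
    # Hypothetical line without the empty pair, for comparison.
    "Tunnel-Private-Group-ID=(tag=0) 1, attribute-131=00:00:00:01, Airespace-Wlan-Id=100",
]

empty_counts = count_empty_keys(sample_lines)
print(dict(empty_counts))
```

In real use the lines would come from the syslog file on the Linux server instead of a hard-coded list.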
I opened a case with Cisco TAC, hoping it was a missed configuration somewhere. We combed through the smaller WLCs' configs, then the ISE config. TAC looked at the same thing I'm describing here and found nothing out of place. They're still working on it, actually. But to me it's becoming clearer that the NULL value is an ISE bug/error of some sort. I suspect there is little chance Cisco is going to fix this in code anytime soon, but I'm fighting the good fight.
However, the reason I'm here is to ask how to get the Splunk ISE TA add-on to bypass this one empty/NULL value pair, so the parser can continue cranking through the rest of the message and disregard any NULL values it encounters. The fact that the ISE TA add-on doesn't have a way to work through this is a problem in itself, separate from whatever issue is leading ISE to send a NULL value for the "attribute-89" field. BTW: to be clear, I have no idea what this "attribute-89" field is, nor do I care - I care about the REST OF THE JUICY DATA that comes after this value.
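To make the desired behaviour concrete, here is a minimal Python sketch of a key=value parser that records an empty value as an empty string and keeps going. It is NOT the TA's actual extraction logic and not a Splunk configuration; the function name parse_pairs and the splitting rule (split only on commas that are followed by something that looks like "key=") are assumptions chosen for illustration, using the fragment from the snippet above as input.

```python
import re

# Split only on commas that are followed by something shaped like "key=".
PAIR_SPLIT = re.compile(r",\s*(?=[A-Za-z][\w-]*=)")

def parse_pairs(text):
    """Return (key, value) tuples; an empty value becomes "" and parsing continues."""
    pairs = []
    for chunk in PAIR_SPLIT.split(text):
        key, sep, value = chunk.partition("=")
        if not sep:
            continue  # no "=" in this chunk, so it is not a key=value pair
        pairs.append((key.strip(), value.strip()))
    return pairs

sample = (
    "NAS-Port-Type=Wireless - IEEE 802.11, Tunnel-Type=(tag=0) VLAN, "
    "Tunnel-Medium-Type=(tag=0) 802, Tunnel-Private-Group-ID=(tag=0) 1, "
    "attribute-89= , attribute-131=00:00:00:01, "
    "cisco-av-pair=audit-session-id=0a40016400020ebe5c5ca1e2, "
    "cisco-av-pair=mDNS=true, Airespace-Wlan-Id=100"
)

parsed = parse_pairs(sample)
for key, value in parsed:
    print(f"{key!r} -> {value!r}")
```

The result is kept as a list of tuples rather than a dict because cisco-av-pair appears more than once in the same message. A value that itself contains a comma followed by "word=" would be split incorrectly by this rule, so treat it as an illustration of the idea only.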
Ideally, the ISE TA add-on would handle this gracefully and not stop parsing right there. It would be nice if this were built into the add-on to prevent this type of problem on other ISE syslogs. I'm looking for any advice on how to get the ISE TA add-on to ignore that NULL value and move on so the rest of the log gets properly parsed. Thanks for any help.