All Apps and Add-ons

Null Value Breaks Log Parsing on Splunk ISE TA CISE_Passed_Authentications

pjasa
New Member

SUMMARY: the ISE TA add-on has been working great for months, however an odd problem was recently discovered with the ISE TA add-on where logs sent by ISE for two smaller WLC (wireless LAN controllers) are being sent by ISE with a NULL value on just one field, which unfortunately appears to break the message parsing from that point forward just on those specific CISE_Passed_Authentications logs. The same type of logs (same authentication, same policy, same EVERYTHING on ISE) are sent by ISE for the much larger WLC get parsed as expected, nothing breaks, nothing is missing.

QUESTION: Im looking for any advice on how to get the ISE TA add-on to ignore the single NULL value and move on to parse rest of the log message sent by ISE despite the NULL value.

SORDID DETAILS: The issue was not easy to see because most of our ISE logs come from one big fat controller cluster - a big ol' Cisco Flexcontroller WLC configured to handle dozens of large sites. This larger WLC does the bulk of authentications to ISE. However, we have two smaller local controllers at specific remote sites. All three WLCs are on the exact same code version, all WLC settings have been verified two or three times now. Also it is KEY to understand that all WLC large and small use the exact same policy on ISE - there are no differentiators regarding NAS-ID, NAS device or anything else. A user authenticating through the large WLC uses the same identical ISE policies as a user authenticating at one of the smaller WLC sites. You should probably disregard the fact there are three WLCs (one large, two small), Im only relaying this info to show that Splunk TA can and does parse these logs correctly, in the cases where there is no NULL value. The Splunk ISE TA app just needs to be able to handle the NULLs gracefully when/if they're encountered (which it currently does not).

As I later discovered, the parsing problem is clearly visible when comparing ISE logs of the large WLC controller with the smaller WLC's ISE logs: the normal CISE_Passed_Authentication logs for the large WLC coming from ISE are humongous, 30+ lines on Splunk and probably the full 8192 bytes maximum, compared to the logs whose parsing breaks which are only about 6 lines on Splunk's GUI. The ISE syslog size was set to the Maximum Length of 8192 months ago from the start per the documentation.

All the "short logs" get cut-off at the exact same point on the Splunk GUI, after a field called "attribute-89= ," shown here (NOTE: this is just a snippet to illustrate the issue, the ellipses ........ indicate text before/after)

AS SEEN ON SPLUNK, PARSING STOPS HERE

...........NAS-Port-Type=Wireless - IEEE 802.11, Tunnel-Type=(tag=0) VLAN, Tunnel-Medium-Type=(tag=0) 802, Tunnel-Private-Group-ID=(tag=0) 1, attribute-89= ,

SAME RAW LOG IS MUCH LARGER

..........NAS-Port-Type=Wireless - IEEE 802.11, Tunnel-Type=( tag=0) VLAN, Tunnel-Medium-Type=(tag=0) 802, Tunnel-Private-Group-ID=(tag=0) 1, attribute-89= , attribute-131=00:00:00:01, cisco-av-pair=audit-session-id=0a40016400020ebe5c5ca1e2, cisco-av-pair=mDNS=true, Airespace-Wlan-Id=100, ..........

After this field called "attribute-89= ," there is no more message on Splunk, it stops right there, and thus the rest of the parsed fields are not present on those logs - ie: Splunk correctly parses all fields up to that point, but nothing after that, as if the message ends (and it looks that way on the GUI too, no more text after that last comma).

First I had to confirm the entire ISE syslog was getting to Splunk - that it wasn't somehow being prematurely cut off by ISE before being sent to Splunk. So I configured ISE to send duplicate syslogs: one to a regular Linux server, and one to Splunk. This confirmed that the log message itself is not being cut-off by ISE - the entire log message showed up. This allowed me to validate that there is a lot more data after this "attribute-89= ,".

Moreover, searching those raw syslogs on the Linux box also helped me to confirm that at no point is there ever an empty value pair OTHER THAN with this "attribute-89". IOW, only this field comes paired with a NULL value. Which is why I've arrived at the conclusion that ISE is sending a bad value pair which causes Splunk to stop parsing right there, and disregards the rest of the large log message (guess it gets tossed or something).

I opened a case with Cisco TAC, hoping for a missed configuration somewhere. We combed through the smaller WLC's configs, then the ISE config, TAC looked at the same thing Im describing here and found nothing out of place. They're still working on it, actually. But to me it's becoming clearer that the NULL value seems to be an ISE bug/error of some sort. I suspect there is little chance Cisco is going to fix this in code anytime soon, but im fighting the good fight.

However, the reason Im here is to ask how to get the Splunk TA ISE add-on to bypass this one empty/NULL value pair to allow the ISE TA parser to continue cranking through the rest of the message and disregard any encountered NULL values. The fact the ISE TA add-on doesnt have a way to work through this is a problem itself, separate from whatever issue is leading ISE to send a NULL value for the "attribute-89" field. BTW: to be clear, I have no idea what this "attribute-89" field is nor do I care - i care about the REST OF THE JUICY DATA that comes after this value.

Ideally, the ISE TA add-on would gracefully handle this and not stop parsing right there, it would be nice if this was configured into the add-on to prevent this type of problem on other ISE syslogs. Im looking for any advice on how to get the ISE TA add-on to ignore that NULL value and move on so the rest of the log gets properly parsed. Thanks for any help.

0 Karma

mdoyle_splunk
Splunk Employee
Splunk Employee

The problem with the null field value is not just a problem for the TA, but a bigger problem with ingesting the logs over the network.

The null character causes syslog or whatever network listener you are using to mark the end of the payload, the rest of the message is considered the start of a new message.

The "Splunk" workaround is to use LINE_BREAKER and EVENT_BREAKER and set them both to something similar to:
<\d+>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+\S+\s+\d+\s+\d+\s+0\s

matches: <181>Dec 11 12:07:18 hostname CISE_RADIUS_Accounting 0007734788 1 0

where the first part is the syslog priority and the date/time stamp (yours could be different), but this pattern defines the start of a new event and will ignore the nulls that have been written to the file as newlines. The events will look a little ugly, but at least all the fields will be in one event.

This also will re-assemble the multi-part messages that are sent by ISE at index time.

This WILL NOT WORK if you are doing any sort of load-balancing of your syslog servers that might cause the different parts of the multi-part message to be logged to different servers, in those cases you will have to assemble the parts at search time.

0 Karma

mdoyle_splunk
Splunk Employee
Splunk Employee

The problem with the null field value is not just a problem for the TA, but a bigger problem with ingesting the logs over the network.

The null character causes syslog or whatever network listener you are using to mark the end of the payload, the rest of the message is considered the start of a new message.

The "Splunk" workaround is to use LINE_BREAKER and EVENT_BREAKER and set them both to something similar to:
<\d+>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+\S+\s+\d+\s+\d+\s+0\s

matches: <181>Dec 11 12:07:18 hostname CISE_RADIUS_Accounting 0007734788 1 0

where the first part is the syslog priority and the date/time stamp (yours could be different), but this pattern defines the start of a new event and will ignore the nulls that have been written to the file as newlines. The events will look a little ugly, but at least all the fields will be in one event.

0 Karma

pjasa
New Member

UPDATE:
Im running the Splunk Add-On for Cisco Identity Services (ISE) v2.2.2
I was running Splunk v6.5.5 and seeing this issue, so proceeded to upgrade to Splunk v7.2.4 and STILL the SAME PROBLEM.

Is there anyone that can help please???? It's unreal that one NULL value pair in the log ruins the whole/entire add-on for ISE -
PLEASE HELP

0 Karma

vishaltaneja070
Motivator

Hello @pjasa

I think the issue is with parsing of log. Can you please check EVENT_BREAKING in props.conf.

0 Karma

pjasa
New Member

Hi Vishal, that attribute is not defined in the /opt/splunk/etc/apps/Splunk_TA_cisco-ise/default/props.conf file, however (see below instructions from props.conf admin manual) I dont think this attribute will help handle the NULL value - plus note there's zero problems with much larger events when that value pair does not come with a NULL value (they come in fully). The issue seems to be the handling of the NULL value. Thanks.

EVENT_BREAKER_ENABLE = [true|false]
* When set to true, the UF will split incoming data with a light-weight
chunked line breaking processor so that data is distributed fairly evenly
amongst multiple indexers. Use this setting on the UF to indicate that
data should be split on event boundaries across indexers especially
for large files.
* Defaults to false.

EVENT_BREAKER =
* When set, the UF will use the setting to define an event boundary at the
end of the first matching group instance.

0 Karma
Get Updates on the Splunk Community!

Splunk Smartness with Brandon Sternfield | Episode 3

Hello and welcome to another episode of "Splunk Smartness," the interview series where we explore the power of ...

Monitoring Postgres with OpenTelemetry

Behind every business-critical application, you’ll find databases. These behind-the-scenes stores power ...

Mastering Synthetic Browser Testing: Pro Tips to Keep Your Web App Running Smoothly

To start, if you're new to synthetic monitoring, I recommend exploring this synthetic monitoring overview. In ...