Getting Data In

Cisco TCP Syslogs - Close Events Not Split

iunderwood
Path Finder

I'm running a Splunk instance on a dev box, and one of the things I'm testing is the feasibility of using TCP for syslog. Numerous people and articles suggest TCP is the way to go, and for many reasons this is a sensible choice. However, two things are not working as I expect and are standing in the way.

  • Closely spaced events do not split.

Here is an entry where the events are so rapid-fire that they don't split out into individual events of their own:

<188>640624: 640621: Aug 9 2012 00:12:09.428 UTC: %CRYPTO-4-RECVD_PKT_MAC_ERR: decrypt: mac verify failed for connection id=2001 local=x.x.x.x remote=y.y.y.y spi=E78EE4D2 seqno=000022E9<188>640625: 640622: Aug 9 2012 00:12:14.012 UTC: %IPS-4-SIGNATURE: Sig:2004 Subsig:0 Sev:2 ICMP Echo Req [z.z.z.z:0 -> a.a.a.a:0]

I expect this could be taken care of in props.conf with something like this:

[source::tcp:5170]
BREAK_ONLY_BEFORE = (\<\d+\>\d)+

I have also tried: \<\d{1,3}\>

I figure there must be something simple I'm missing in my regex for this statement.

  • TCP logging sometimes stops.

Every now and then, one of my remote sites undergoes carrier maintenance and is unreachable for some time. Afterwards, TCP logging from that site never seems to recover. In a "show log", the output looks similar to this:

Trap logging: level informational, 6360 message lines logged
    Logging to 192.168.16.10  (tcp port 5140, audit disabled,
          link down),

The quickest way I have found to restart logging is to remove and re-add my log host, which isn't really practical. I haven't found a way to tell the device to keep retrying after a certain amount of time. This isn't specifically a Splunk thing, but I'm hoping someone out there has already run across this.

Thanks in advance for the assistance!

++I;

1 Solution

iunderwood
Path Finder

The LINE_BREAKER hint pointed me in the right direction for the first part of the question.

LINE_BREAKER = ([^\<])\<\d{1,3}\>

Essentially, the selector in parentheses matches any character that's NOT a <, followed by the priority expression, so it won't break before the first message, but it will before each subsequent one. I have this working well in my development environment.
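To see how this kind of pattern splits the sample from the question, here is a minimal Python sketch that imitates the documented LINE_BREAKER behavior: text before the first capture group ends one event, the captured text is discarded, and the next event starts right after the group. The `split_events` helper is hypothetical, and Python's `re` stands in for Splunk's PCRE engine, so treat it as an illustration rather than a reproduction of the indexer. `VARIANT_PATTERN` is an assumption added here for comparison and has not been tested in Splunk.

```python
import re

# Sample stream from the question: two messages with no newline between them.
STREAM = (
    "<188>640624: 640621: Aug 9 2012 00:12:09.428 UTC: %CRYPTO-4-RECVD_PKT_MAC_ERR: "
    "decrypt: mac verify failed for connection id=2001 local=x.x.x.x remote=y.y.y.y "
    "spi=E78EE4D2 seqno=000022E9"
    "<188>640625: 640622: Aug 9 2012 00:12:14.012 UTC: %IPS-4-SIGNATURE: "
    "Sig:2004 Subsig:0 Sev:2 ICMP Echo Req [z.z.z.z:0 -> a.a.a.a:0]"
)

# Pattern from the accepted answer.
AUTHOR_PATTERN = r"([^\<])\<\d{1,3}\>"

# Hypothetical variant (not from the thread, untested in Splunk): the capture
# group only consumes optional newlines, so no payload character is captured.
VARIANT_PATTERN = r"([\r\n]*)<\d{1,3}>"


def split_events(stream, line_breaker):
    """Imitate LINE_BREAKER: break at capture group 1 and discard its text."""
    events = []
    start = 0
    for match in re.finditer(line_breaker, stream):
        events.append(stream[start:match.start(1)])  # text before the group
        start = match.end(1)                         # next event starts after it
    events.append(stream[start:])
    return [event for event in events if event]      # drop empty leading pieces


for name, pattern in (("author", AUTHOR_PATTERN), ("variant", VARIANT_PATTERN)):
    events = split_events(STREAM, pattern)
    print(name, len(events), "events; first event ends with", repr(events[0][-10:]))
```

In this simulation both patterns yield two events. With the pattern from the answer, the single character captured before the second `<188>` is discarded, so if the device sends no delimiter between messages, the first event loses its final character. If the device does send a newline or other separator there, that separator is what gets dropped. LINE_BREAKER is typically paired with `SHOULD_LINEMERGE = false` so that the resulting events are not merged again.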

As far as the TCP restarting goes, I've asked that on the Cisco forum. I'll update this answer with that information depending on what I find there.

iunderwood
Path Finder

The TCP logging failure seems to affect IOS 12.2, at least on all my switching platforms. All my routers run 12.4 or higher and recover without incident. I now have a case open with TAC.

dwaddle
SplunkTrust

We solved this problem somewhat unconventionally, but it works. I wrote a C program (I'll see if I can post it to Splunkbase; I need to make sure the bosses approve) that receives the TCP data from a Cisco ASA and spools it to a file, which Splunk then reads. I assume it would work with IOS-based devices after some adjustment, but I'm not sure. It filters the <xxx> part out entirely, and even handles a special case where the ASA logs a continuation line that starts with a space.

On the ASA, each line is \n terminated. If this is true with IOS, something like this may work:

LINE_BREAKER=([\r\n]+<\d{1,3}>)

(Note I've not tested the above)

It is quite a pain when Cisco devices disable TCP syslog output entirely. We deal with this via a Nagios check that looks for an ESTABLISHED socket in netstat for each firewall we know should be there. When one isn't there, we alarm on it and work the issue accordingly. With the *nix app, you could probably accomplish this same check without a lot of effort as a search+alert.
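The socket check described above could be sketched roughly as follows. This is a minimal Python illustration, not the Nagios check from the post: the `missing_senders` helper, the sample output, and the 10.0.0.x addresses are all hypothetical, the column layout assumes Linux `netstat -tn`, and port 5140 is taken from the "show log" output in the question.

```python
def missing_senders(netstat_output, expected_ips, port):
    """Return expected sender IPs that have no ESTABLISHED socket on `port`."""
    connected = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        # The local address must be the syslog listener port.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        connected.add(remote.rsplit(":", 1)[0])
    return sorted(set(expected_ips) - connected)


# Hypothetical `netstat -tn` output on the receiving host.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.168.16.10:5140      10.0.0.1:40211          ESTABLISHED
tcp        0      0 192.168.16.10:5140      10.0.0.2:51873          TIME_WAIT
tcp        0      0 192.168.16.10:22        10.0.0.3:60122          ESTABLISHED
"""

# Hypothetical list of devices that should always be connected.
EXPECTED = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

print(missing_senders(SAMPLE, EXPECTED, 5140))
```

In a real check, the sample text would be replaced by the live output of `netstat -tn` on the receiver, and a non-empty result would raise the alarm.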