Splunk Search

Counting repeating fields in events

mjtung
Explorer

How do I properly count fields which repeat in each event?

Here is a sample event:

2885136 2010-10-04 03:55:35.997 1       
Run SqlCommand [GTRawRhoSelect] of Command Type [StoredProcedure]. 
Command parameters: 
      @RICPairs : type(SqlString) : [ <RICPairs xmlns=""><RICPair RIC1="ABC" RIC2="DEF" /><RICPair RIC1="ABC" RIC2="GHI" /><RICPair RIC1="GHI" RIC2="JK.LM" /><RICPair RIC1="JK.LM" RIC2="OPQ" /></RICPairs> ];
      @ValueTypeParam : type(SqlDouble) : [ Null ];

I want to identify all the distinct RICPair 's in all the logs, so above, this would be:

ABC-DEF
ABC-GHI
GHI-JK.LM
JK.LM-OPQ

and I want to show a count that each distinct pair appears. Right now, if I use the default fields that are created (eg. RIC1), it only shows the FIRST occurence in each event (so here it detects ABC as a RIC1, but not GHI).

How do I get it to count properly?

Lowell
Super Champion

Splunk can handled multi-value fields, for example RIC1=[ABC, ABC, GHI, JK.LM] and RIC2=[DEF, GHI, JK.LM, OPQ]. But the problem here is that you want to handle two sets of multivalue fields as pairs of values, which splunk doesn't really handle out of the box. So we have to look for an option other than standard mulit-value field extraction.

So one really easy and somewhat lazy approach would be to just capture a new field that contains both values, and count against that field.

First, this search should extract a multi-value field called "pair": (You should see multiple entires in the "pair" column)

... | rex "<RICPair\s+(?<pair>.+?)\s*/>" max_match=100 | table _time, pair

Now that we have a list (multi-value field) containing the pair string, it should be possible to get your distinct count:

... | rex "<RICPair\s+(?<pair>.+?)\s*/>" max_match=100 | stats dc by pair

After we have the stats like this, we can break the "pair" field apart into your two separate fields, like this:

... | rex "<RICPair\s+(?<pair>.+?)\s*/>" max_match=100 | stats dc by pair | rex field=pair "RIC1=\"(?<RIC1>.*?)\" RIC2=\"(?<RIC2>.*?)\""

Does that help at all?

mjtung
Explorer

I've been trying again, this time running fewer programs on my CPU, and it runs for a long time, with some results, but it does not finish.

I get the error: The running job "1286887054.102" was canceled remotely or expired.

I have set max_match=20000, because that is what my data logs are like: there may be up to 20000 pairs in one event.

Any ideas?

0 Karma

mjtung
Explorer

Ok I found out more....I think I might have a memory or other environment issue.

When I retried this morning, I was actually able to see a table of multiple pair entries for each event (using the | table _time, pair command). However, on a second try, I could no longer see any pairs. And when I did a count by pair, it was only detecting the first pair again.

I restarted splunk and cleaned the index, then tried again, and this time splunkd crashed while running the first command above - but I can clearly see that multiple pairs came up per event in the results.

Before I restarted I also sa

0 Karma

Lowell
Super Champion

That shouldn't be a problem in the example given. The max_match=100 should allow the regex to match multiple times and return a multi-value field called "pair" will all of the matching fields. I suppose you could try adding a mvexpand before the stats, that shouldn't be necessary, but you could try it.

0 Karma

mjtung
Explorer

Nope, I have actually tried something similar before already, but the problem with this method is it only returns the first pair (so in the above example it would only detect the ABC-DEF pair.

I have in another instance been able to detect 2 pairs in a single event, but that event had something like 30 pairs. In most other instances I am only able to detect the first pair.

I think the problem is that there could be infinite numbers of RICPairs in each event - is there another way to count it?

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...