Splunk Search

Count of events by field where field has multiple instances of same value

djohnson99
Explorer

Hi there 

We presently have a setup where error codes are extracted and put into their own field.  The way it's been setup is that we extract all instances where it matches a certain regex pattern (e.g. "ERROR-" -> ERROR-1234, ERROR-5869 etc) meaning as the error code is repeated in the event we get multiple instances of the same item in the field (e.g. "ErrorField" = ERROR-1234, ERROR-1234, ERROR-5869).

Doing 'stats count by ErrorField' seems to return all items, even the ones that are repeated as we'd get ERROR-1234 = 700 on the stats count, but a simple search where ErrorField=ERROR-1234 returns say 300 events only.

I've tried to come up with a way to filter out the duplicates in the search but have so far come up short.  My two possibles are creating a table of field instances, and then using that to launch a second search, or performing a sub search to capture the unique fields that are then used to count the events by field.  What I have so far is as follows:

index=index ErrorField="*"
| stats count by ErrorField
| replace ERROR-X-* with ERROR-X- in ErrorField
| where count > 200 AND ErrorField!="ERROR-X-"
| table ErrorField

In short I've got a search that counts, then does a replace/where to weed out values below a certain threshold/don't match a certain criteria then putting what remains in a table.  It's the results from this table I want to use in the next search and count the total events for each item in that table, but have so far failed to be able to do this.

Is this possible to do, and is there a right way to do this, either continuing with the method above or using a subsearch?  Alternately is it possible to remove duplicates in the original field extraction so this isn't necessary?  Although that option may not be possible as the field extraction isn't handled by ourselves and I can't say too much about it.

Thank you!

Labels (1)
Tags (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

Are you sure that these are duplicates

If your ErrorField is a multivalue field then doing the stats count by will count by each of the values of the MV field

So, it's simple to remove duplicates with 

| eval ErrorField=mvdedup(ErrorField)

 If the ErrorField is not multivalue (I am not sure how you would get your results if it is a simple string with comma separated values) then you could do

| eval ErrorField=mvdedup(split(ErrorField,","))

Hope this helps

 

View solution in original post

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Are you sure that these are duplicates

If your ErrorField is a multivalue field then doing the stats count by will count by each of the values of the MV field

So, it's simple to remove duplicates with 

| eval ErrorField=mvdedup(ErrorField)

 If the ErrorField is not multivalue (I am not sure how you would get your results if it is a simple string with comma separated values) then you could do

| eval ErrorField=mvdedup(split(ErrorField,","))

Hope this helps

 

0 Karma

djohnson99
Explorer

That is tremendously simple and I can't believe that was it needed.  Thank you for this!  

The final query is now:

index=index ErrorField="*"
| eval ErrorField=mvdedup(ErrorField)
| stats count by ErrorField
| replace ERROR-X-* with ERROR-X- in ErrorField
| where count > 200 AND ErrorField!="ERROR-X-"

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...