Filtering with dedup depending on number of results

timmalos
Communicator

I have a search that monitors my NetBackup jobs in real time.

search = index=Infra_NB sourcetype="NbJobs" site=$site$ (NOT HiddenByOperator="*") | fillnull value="-" Client jobCopy Policy Schedule jobFileList | dedup Client Policy Schedule jobFileList sortby -_time | dedup jobId sortby -_time | search jobStatus>1 jobStatus!=150 | sort -_time | table Type Date site Client Policy Status

Most of the time I get between 2 and 20 results, but when there is a lot of trouble the list can grow considerably. What I would like is to add another dedup to the search (| dedup Client Policy Schedule sortby -_time) only if there are more than 20 results.

I tried something along these lines:

|eventstats count | eval test=if(count>20,DEDUP,nothing)

Thanks for your help,

Ayn
Legend

I imagine you could achieve something like this using a combination of eventstats and streamstats.

... | eventstats count as totalcount | streamstats count as dcount by Client,Policy,Schedule | where count>20 AND dcount<2

eventstats gets the total count of events, and streamstats assigns a running count for each combination of Client, Policy, Schedule. where then checks whether the condition count>20 has been met and, if so, keeps only the events where dcount<2, that is, only 1 event per combination you want to dedup on.
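
To make the two counters concrete, here is a minimal Python sketch of what eventstats and streamstats attach to each event. It only models the behaviour described above and is not Splunk code; the function name annotate and the sample events are made up for illustration.

```python
from collections import Counter

def annotate(events, keys=("Client", "Policy", "Schedule")):
    """Model of: eventstats count as totalcount | streamstats count as dcount by <keys>."""
    total = len(events)   # eventstats: the same total is written on every event
    running = Counter()   # streamstats: running count per key combination
    out = []
    for ev in events:
        combo = tuple(ev[k] for k in keys)
        running[combo] += 1
        out.append({**ev, "totalcount": total, "dcount": running[combo]})
    return out

# Hypothetical sample events, assumed to be sorted newest first.
sample = [
    {"Client": "srv1", "Policy": "P1", "Schedule": "Full"},
    {"Client": "srv1", "Policy": "P1", "Schedule": "Full"},
    {"Client": "srv2", "Policy": "P1", "Schedule": "Incr"},
]

for row in annotate(sample):
    print(row["Client"], row["totalcount"], row["dcount"])
# srv1 3 1
# srv1 3 2
# srv2 3 1
```

The second srv1 event gets dcount=2, so a filter on dcount<2 keeps only the first event of each combination.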

Ayn
Legend

Ah yes, most definitely, I wrote the search off the top of my head so it was bound to have bugs in it right from the start - great that you got it working!

timmalos
Communicator

Thanks a lot. This is the corrected version (I replaced where with search, since where is not needed here and, as far as I know, it's better to use search when possible):

|eventstats count|streamstats count as dcount by Client,Policy,Schedule|search (count>20 AND dcount<2) OR count<=20
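
As a sanity check on the condition, the following Python sketch models this filter outside Splunk. It is an illustration under assumptions: the helper names conditional_dedup and make_events, the threshold parameter, and the generated events are hypothetical, and events are assumed to arrive newest first as in the original search.

```python
from collections import Counter

def conditional_dedup(events, keys=("Client", "Policy", "Schedule"), threshold=20):
    """Keep everything when there are at most `threshold` events;
    otherwise keep only the first event of each key combination."""
    count = len(events)   # eventstats count
    seen = Counter()
    kept = []
    for ev in events:
        combo = tuple(ev[k] for k in keys)
        seen[combo] += 1  # streamstats count as dcount by keys
        dcount = seen[combo]
        # Same condition as: search (count>20 AND dcount<2) OR count<=20
        if (count > threshold and dcount < 2) or count <= threshold:
            kept.append(ev)
    return kept

def make_events(n, clients):
    # Hypothetical data: n events spread round-robin over `clients` clients.
    return [
        {"Client": f"srv{i % clients}", "Policy": "P1", "Schedule": "Full", "jobId": i}
        for i in range(n)
    ]

print(len(conditional_dedup(make_events(10, 3))))  # 10: at most 20 events, all kept
print(len(conditional_dedup(make_events(30, 3))))  # 3: more than 20 events, one per client
```

Note that count is taken before the deduplication, so a list of just over 20 events can shrink to far fewer rows once the extra dedup applies.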

rtadams89
Contributor

There isn't really any programmatic logic built into Splunk search commands to do this, but there may still be a way to accomplish your end goal.

What are you doing with the results returned? Displaying them on a custom dashboard, creating a PDF report, alerting/emailing them, ...? Why do you want to add the second dedup only when there are more than 20 results (and not all the time)?

If I am visualizing your data correctly, the additional dedup command should only remove events with the same client/policy/schedule but a different jobFileList. Could you instead pipe your main search to | stats dc(jobFileList) by Client, Policy, Schedule or | stats values(jobFileList) by Client, Policy, Schedule to get a more acceptable format all the time?
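
For illustration, here is a rough Python model of what those two stats variants would return: one row per Client/Policy/Schedule combination, with either the number of distinct jobFileList values or the values themselves. The function name stats_by and the sample events are hypothetical, and this is only a sketch of the grouping, not Splunk code.

```python
from collections import defaultdict

def stats_by(events, keys=("Client", "Policy", "Schedule"), field="jobFileList"):
    """Model of: stats dc(field), values(field) by keys (one row per key combination)."""
    groups = defaultdict(set)
    for ev in events:
        groups[tuple(ev[k] for k in keys)].add(ev[field])
    return [
        {**dict(zip(keys, combo)), "dc": len(vals), "values": sorted(vals)}
        for combo, vals in groups.items()
    ]

# Hypothetical sample events.
events = [
    {"Client": "srv1", "Policy": "P1", "Schedule": "Full", "jobFileList": "/data"},
    {"Client": "srv1", "Policy": "P1", "Schedule": "Full", "jobFileList": "/home"},
    {"Client": "srv2", "Policy": "P1", "Schedule": "Incr", "jobFileList": "/data"},
]

for row in stats_by(events):
    print(row["Client"], row["dc"], row["values"])
# srv1 2 ['/data', '/home']
# srv2 1 ['/data']
```

Either way, each client/policy/schedule takes a single row regardless of how many file lists failed, which keeps the table size bounded by the number of combinations.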

timmalos
Communicator

Thanks for your help, I used Ayn's suggestion. To answer your questions: the results are shown in real time on a custom dashboard with some custom jQuery and CSS. When there are not many errors, we want to see all errors for one client (we use all the screen space), but when there are a lot of errors I don't want to differentiate by jobFileList; I would rather display more clients in error.

(In fact I don't even display the FileList field in the final table, but I can see whether there are many errors for one client or many clients that each have at least one error. I know I won't have a good day in that case ^^)
