Dashboards & Visualizations

Dedup search results only if conditional token has been set

dougburdan
Explorer

I'm developing a dashboard with 12 inputs which allows users to filter data in a variety of ways, and I'm having trouble getting the last input to work - a dropdown called 'Duplicates' with two options - 'Include' and 'Exclude'.

If we say that each day, an item can fail once, several times, or not at all . . .

  • Some users just want to know how many days in the last month that an item has failed (exclude duplicates).
  • Whereas other users want to know how many times the item failed each day (include duplicates).

If I were using conventional coding, I'd be trying to achieve something along the lines of:

If $duplicateToken$ = "Exclude" then dedup field1 field2 field3 etc.

However, I'm not sure what approach to take in Splunk, so any suggestions would be greatly appreciated.

0 Karma
1 Solution

HiroshiSatoh
Champion

You can embed in this way. how is it?

<form>
  <fieldset submitButton="false">
    <input type="radio" token="field1">
      <label>dedup</label>
      <choice value="|dedup sourcetype">yes</choice>
      <choice value="|noop">no</choice>
    </input>
  </fieldset>
  <row>
    <panel>
      <event>
        <search>
          <query>index=* |head 10 $field1$</query>
          <earliest>-60m@m</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">20</option>
        <option name="list.drilldown">none</option>
        <option name="list.wrap">1</option>
        <option name="maxLines">5</option>
        <option name="raw.drilldown">full</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">0</option>
        <option name="table.drilldown">all</option>
        <option name="table.sortDirection">asc</option>
        <option name="table.wrap">1</option>
        <option name="type">list</option>
      </event>
    </panel>
  </row>
</form>

View solution in original post

arjunpkishore5
Motivator

@dougburdan Based on the sample you posted, the answer by @HiroshiSatoh will work.

I have a slightly different approach, but at the same time it is similar to @HiroshiSatoh 's answer. Here, Instead of dedup, I'm just changing the aggregation from count to distinct count (dc) . Since the count is over date, the distinct count will always be 1 if the item exists

<input type="dropdown" token="dedupToken" searchWhenChanged="true">
      <label>Duplicates</label>
      <choice value="dc">Exclude</choice>
      <choice value="count">Include</choice>
      <initialValue>dc</initialValue>
      <default>dc</default>
    </input>

And then in your search, replace count by dedupToken

<query>
 | inputlookup alerts_CTM_all_added_fields.csv
 | eval earliest=$toearliest$
 | eval latest=if($tolatest$<=0,now(),$tolatest$)
 | where eTime >= earliest AND eTime <= latest 
 | search Source $sourceToken$ Nodeid="$nodeToken$" Application="$appToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$" Essential="$essToken$"
 | sort 0 Time
 | chart $dedupToken$(Date) OVER Date BY $filterToken$
 | eval eDate=strptime(Date,"%Y-%m-%d")
 | eval wday=strftime(eDate,"%a")
 | eval day=ltrim(strftime(eDate, "%e"))
 | eval suffix=if(day=1,"st",if(day=2,"nd",if(day=3,"rd",if(day=21,"st",if(day=22,"nd",if(day=23,"rd",if(day=31,"st","th"))))))) 
 | eval month=strftime(eDate,"%b")
 | eval Date=wday+" "+day+suffix+" "+month
 | fields - eDate wday day suffix month
</query>

Give it a try and let me know if this is what you were looking for.

Cheers

0 Karma

dougburdan
Explorer

Thanks @arjunpkishore5. That code works, but doesn't have the required result - sorry, it's hard to describe in words.

(I just tried pasting some screenshots into this post, but it asked for the url of the screenshots, which I'm not sure how to do, so will have to persevere with words :o(

On the dashboard, if I select a 'Source' of ctmsp and include duplicates and leave 'Filter By' as the default of 'Source', the dashboard returns a mono-colour chart showing 100-200 alerts per day from ctmsp over the last 30 days.

If we say that 20% of alerts per day are duplicates of other alerts that same day, when I exclude duplicates I would expect the chart to show 80-160 alerts per day for the last 30 days from ctmsp.

However, using your code, when I select exclude duplicates, the chart only shows 1 alert per day - I'm guessing that the dc is counting the number of items in filterToken, so it returns 1 because I only have a single source selected?

So, sorry, not quite what I was after - but I think it's because I'm not describing it well enough.

Basically, the source CSV file has about 10 fields, and when exclude duplicates is selected, the effect should be a dedup of all the fields except for 'Time', so as to only return unique alerts for each day.

Does that make sense?

Thank you for your suggestion.

0 Karma

arjunpkishore5
Motivator

multiple approaches to this depending on what you're trying to achieve. Could you please paste a a sample query of what you're doing before the dedup?

0 Karma

dougburdan
Explorer

Thanks arjunpkishore5. Yeah good point. The question is complicated to explain, so I've posted the entire XML below.

The dashboard has 12 inputs (4 with dynamically updating dropdowns), so to help with performance I have a scheduled report appending data to a CSV file, which is what the dashboard reads.

The dedup dropdown/token is set between lines 152-158, and I would have thought that the dedup (or not) should happen after the search on line 180 but before the chart on line 182?

<form theme="light">
  <label>Batch Alerts By Source</label>
  <search>
    <query>
        | makeresults
      </query>
    <earliest>$timeToken.earliest$</earliest>
    <latest>$timeToken.latest$</latest>
    <progress>
      <eval token="toearliest">strptime($job.earliestTime$,"%Y-%m-%dT%H:%M:%S.%3N%z")</eval>
      <eval token="tolatest">strptime($job.latestTime$,"%Y-%m-%dT%H:%M:%S.%3N%z")</eval>
      <set token="jobearliest">$job.earliestTime$</set>
      <set token="joblatest">$job.latestTime$</set>
      <unset token="showTokens">"x"</unset>
    </progress>
  </search>
  <fieldset autoRun="true" submitButton="false">
    <input type="time" token="timeToken" searchWhenChanged="true">
      <label>Date Picker</label>
      <default>
        <earliest>-30d@d</earliest>
        <latest>now</latest>
      </default>
    </input>
    <input type="multiselect" token="sourceToken" searchWhenChanged="true">
      <label>Source</label>
      <choice value="ctmsd">ctmsd</choice>
      <choice value="ctmsq">ctmsq</choice>
      <choice value="ctmsp">ctmsp</choice>
      <choice value="MVSDEV">MVSDEV</choice>
      <choice value="MVSOAT">MVSOAT</choice>
      <choice value="MVSPROD">MVSPROD</choice>
      <initialValue>ctmsp,MVSPROD</initialValue>
      <delimiter> ,</delimiter>
      <prefix>IN(</prefix>
      <suffix>)</suffix>
    </input>
    <input type="dropdown" token="nodeToken" searchWhenChanged="true">
      <label>Node ID</label>
      <choice value="*">All</choice>
      <default>*</default>
      <fieldForLabel>Nodeid</fieldForLabel>
      <fieldForValue>Nodeid</fieldForValue>
      <search>
        <query>
          | inputlookup alerts_CTM_all_added_fields.csv 
          | eval earliest=$toearliest$ 
          | eval latest=if($tolatest$&lt;=0,now(),$tolatest$) 
          | where eTime &gt;= earliest AND eTime &lt;= latest
          | search Source $sourceToken$ Application="$appToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$"
          | stats count by Nodeid
          | eval sorted=lower(Nodeid)
          | sort 0 sorted
        </query>
      </search>
    </input>
    <input type="dropdown" token="appToken" searchWhenChanged="true">
      <label>Application</label>
      <initialValue>All</initialValue>
      <choice value="*">All</choice>
      <default>*</default>
      <fieldForLabel>Application</fieldForLabel>
      <fieldForValue>Application</fieldForValue>
      <search>
        <query>
          | inputlookup alerts_CTM_all_added_fields.csv          
          | eval earliest=$toearliest$ 
          | eval latest=if($tolatest$&lt;=0,now(),$tolatest$) 
          | where eTime &gt;= earliest AND eTime &lt;= latest
          | search Source $sourceToken$ Nodeid="$nodeToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$"
          | stats count by Application
          | eval sorted=lower(Application)
          | sort 0 sorted
        </query>
      </search>
    </input>
    <input type="dropdown" token="groupToken" searchWhenChanged="true">
      <label>Sub Application</label>
      <initialValue>*</initialValue>
      <choice value="*">All</choice>
      <default>*</default>
      <fieldForLabel>Group</fieldForLabel>
      <fieldForValue>Group</fieldForValue>
      <search>
        <query>
          | inputlookup alerts_CTM_all_added_fields.csv          
          | eval earliest=$toearliest$ 
          | eval latest=if($tolatest$&lt;=0,now(),$tolatest$) 
          | where eTime &gt;= earliest AND eTime &lt;= latest
          | search Source $sourceToken$ Nodeid="$nodeToken$" Application="$appToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$"
          | stats count by Group
          | eval sorted=lower(Group)
          | sort 0 sorted
        </query>
      </search>
    </input>
    <input type="text" token="jobToken" searchWhenChanged="true">
      <label>Job</label>
      <initialValue>*</initialValue>
      <default>*</default>
    </input>
    <input type="text" token="messageToken" searchWhenChanged="true">
      <label>Message</label>
      <initialValue>*</initialValue>
      <default>*</default>
    </input>
    <input type="multiselect" token="causeToken" searchWhenChanged="true">
      <label>Cause</label>
      <choice value="Failed">Failed</choice>
      <choice value="Late">Late</choice>
      <choice value="Info">Info</choice>
      <initialValue>Failed,Late,Info</initialValue>
      <prefix>IN(</prefix>
      <suffix>)</suffix>
      <delimiter>,</delimiter>
    </input>
    <input type="dropdown" token="supportToken" searchWhenChanged="true">
      <label>Support Group</label>
      <initialValue>All</initialValue>
      <choice value="*">All</choice>
      <default>*</default>
      <fieldForLabel>Support</fieldForLabel>
      <fieldForValue>Support</fieldForValue>
      <search>
        <query>
          | inputlookup alerts_CTM_all_added_fields.csv          
          | eval earliest=$toearliest$ 
          | eval latest=if($tolatest$&lt;=0,now(),$tolatest$) 
          | where eTime &gt;= earliest AND eTime &lt;= latest
          | search Source $sourceToken$ Nodeid="$nodeToken$" Application="$appToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$
          | stats count by Support
          | eval sorted=lower(Support)
          | sort 0 sorted
        </query>
      </search>
    </input>
    <input type="dropdown" token="essToken" searchWhenChanged="true">
      <label>Non-Essential Alerts</label>
      <choice value="Exclude">Exclude</choice>
      <choice value="Include">Include</choice>
      <initialValue>Exclude</initialValue>
      <default>Exclude</default>
      <change>
        <condition label="Exclude">
          <set token="essToken">Yes</set>
        </condition>
        <condition>
          <set token="essToken">*</set>
        </condition>
      </change>
    </input>
    <input type="dropdown" token="dedupToken" searchWhenChanged="true">
      <label>Duplicates</label>
      <choice value="Exclude">Exclude</choice>
      <choice value="Include">Include</choice>
      <initialValue>Exclude</initialValue>
      <default>Exclude</default>
    </input>
    <input type="dropdown" token="filterToken" searchWhenChanged="true">
      <label>Filter By</label>
      <choice value="Application">Application</choice>
      <choice value="Cause">Cause</choice>
      <choice value="Memname">Job</choice>
      <choice value="Nodeid">Node ID</choice>
      <choice value="Source">Source</choice>
      <choice value="Group">Sub Application</choice>
      <choice value="Support">Support Group</choice>
      <initialValue>Source</initialValue>
    </input>
  </fieldset>
  <row>
    <panel>
      <chart>
        <search>
          <query>
            | inputlookup alerts_CTM_all_added_fields.csv
            | eval earliest=$toearliest$
            | eval latest=if($tolatest$&lt;=0,now(),$tolatest$)
            | where eTime &gt;= earliest AND eTime &lt;= latest 
            | search Source $sourceToken$ Nodeid="$nodeToken$" Application="$appToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$" Essential="$essToken$"
            | sort 0 Time
            | chart count(Date) OVER Date BY $filterToken$
            | eval eDate=strptime(Date,"%Y-%m-%d")
            | eval wday=strftime(eDate,"%a")
            | eval day=ltrim(strftime(eDate, "%e"))
            | eval suffix=if(day=1,"st",if(day=2,"nd",if(day=3,"rd",if(day=21,"st",if(day=22,"nd",if(day=23,"rd",if(day=31,"st","th"))))))) 
            | eval month=strftime(eDate,"%b")
            | eval Date=wday+" "+day+suffix+" "+month
            | fields - eDate wday day suffix month
          </query>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
        <option name="charting.axisLabelsX.majorLabelStyle.rotation">-45</option>
        <option name="charting.axisTitleX.visibility">visible</option>
        <option name="charting.axisTitleY.visibility">visible</option>
        <option name="charting.axisTitleY2.visibility">visible</option>
        <option name="charting.axisX.abbreviation">none</option>
        <option name="charting.axisX.scale">linear</option>
        <option name="charting.axisY.abbreviation">none</option>
        <option name="charting.axisY.scale">linear</option>
        <option name="charting.axisY2.abbreviation">none</option>
        <option name="charting.axisY2.enabled">0</option>
        <option name="charting.axisY2.scale">inherit</option>
        <option name="charting.chart">column</option>
        <option name="charting.chart.bubbleMaximumSize">50</option>
        <option name="charting.chart.bubbleMinimumSize">10</option>
        <option name="charting.chart.bubbleSizeBy">area</option>
        <option name="charting.chart.nullValueMode">gaps</option>
        <option name="charting.chart.showDataLabels">none</option>
        <option name="charting.chart.sliceCollapsingThreshold">0.01</option>
        <option name="charting.chart.stackMode">stacked</option>
        <option name="charting.chart.style">shiny</option>
        <option name="charting.drilldown">none</option>
        <option name="charting.layout.splitSeries">0</option>
        <option name="charting.layout.splitSeries.allowIndependentYRanges">0</option>
        <option name="charting.legend.labelStyle.overflowMode">ellipsisMiddle</option>
        <option name="charting.legend.mode">standard</option>
        <option name="charting.legend.placement">right</option>
        <option name="charting.lineWidth">2</option>
        <option name="trellis.enabled">0</option>
        <option name="trellis.scales.shared">1</option>
        <option name="trellis.size">medium</option>
      </chart>
    </panel>
  </row>
</form>
0 Karma

bowesmana
SplunkTrust
SplunkTrust

@HiroshiSatoh post below is the easiest, i.e. just make a token with the full dedup command and put that in the search

0 Karma

dougburdan
Explorer

(sorry, not sure why lines numbers didn't appear in my previous post)

0 Karma

HiroshiSatoh
Champion

You can embed in this way. how is it?

<form>
  <fieldset submitButton="false">
    <input type="radio" token="field1">
      <label>dedup</label>
      <choice value="|dedup sourcetype">yes</choice>
      <choice value="|noop">no</choice>
    </input>
  </fieldset>
  <row>
    <panel>
      <event>
        <search>
          <query>index=* |head 10 $field1$</query>
          <earliest>-60m@m</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="count">20</option>
        <option name="list.drilldown">none</option>
        <option name="list.wrap">1</option>
        <option name="maxLines">5</option>
        <option name="raw.drilldown">full</option>
        <option name="refresh.display">progressbar</option>
        <option name="rowNumbers">0</option>
        <option name="table.drilldown">all</option>
        <option name="table.sortDirection">asc</option>
        <option name="table.wrap">1</option>
        <option name="type">list</option>
      </event>
    </panel>
  </row>
</form>

dougburdan
Explorer

Have just managed to get it to work.

This is the code for the input:

<input type="dropdown" token="dedupToken" searchWhenChanged="true">
  <label>Duplicates</label>
  <choice value="| dedup Source Memname Orderid Message">Exclude</choice>
  <choice value="| noop">Include</choice>
  <default>| noop</default>
  <initialValue>| noop</initialValue>
</input>

And this is the updated search code for the chart with the dedup token appended to the search:

    <search>
      <query>
        | inputlookup alerts_CTM_all_added_fields.csv
        | eval earliest=$toearliest$
        | eval latest=if($tolatest$&lt;=0,now(),$tolatest$)
        | where eTime &gt;= earliest AND eTime &lt;= latest 
        | search Source $sourceToken$ Nodeid="$nodeToken$" Application="$appToken$" Group="$groupToken$" Memname="*$jobToken$*" Message="*$messageToken$*" Cause $causeToken$ Support="$supportToken$" Essential="$essToken$" $dedupToken$
        | sort 0 Time
        | chart count(Date) OVER Date BY $filterToken$
        | eval eDate=strptime(Date,"%Y-%m-%d")
        | eval wday=strftime(eDate,"%a")
        | eval day=ltrim(strftime(eDate, "%e"))
        | eval suffix=if(day=1,"st",if(day=2,"nd",if(day=3,"rd",if(day=21,"st",if(day=22,"nd",if(day=23,"rd",if(day=31,"st","th"))))))) 
        | eval month=strftime(eDate,"%b")
        | eval Date=wday+" "+day+suffix+" "+month
        | fields - eDate wday day suffix month
      </query>
      <sampleRatio>1</sampleRatio>
    </search>

Thanks to everyone for their help and input!

0 Karma

dougburdan
Explorer

Thanks HiroshiSatoh. I hadn't thought of noop, I think this is a really good idea.

The source of the dashboard is a CSV file which is regularly appended to by a scheduled report - I do this to make the dashboard run faster.

So where you mention deduping the sourcetype and storing it in a token, this would mean that the token would 'contain' several fields (the CSV source has 12 columns) and thousands of rows of data? I am new to Splunk but that sounds too powerful.

I also apologise for not mentioning the CSV source file in my original question.

If it helps, I have just posted the full XML code for the dashboard in a comment under my original question.

0 Karma
Get Updates on the Splunk Community!

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...

Updated Team Landing Page in Splunk Observability

We’re making some changes to the team landing page in Splunk Observability, based on your feedback. The ...