I have several searches that I am trying to optimize now that our platform is on Splunk 5+. My preference is to leverage report acceleration because of its ability to dynamically back-fill and the way it efficiently runs in the background.
Unfortunately, several of my searches use a dedup on multiple fields (e.g., dedup field1 field2 field3) and then run timechart against one of those fields (e.g., timechart span=1d field1). The use of dedup before timechart prevents report acceleration from being used, as it's not a streamable command.
I'm trying to find a way to eliminate the dedup so that this search can be enabled for report acceleration.
I've tried removing the dedup and playing with distinctcount(field1 field2 field3) but that failed. I also tried timechart span=1d dc(field1) by field2 field3 but that also is not allowed. I'm suspicious that I'm overlooking a trivial way to do this. Perhaps the community can enlighten me?
Fair request. I tried to abstract the company stuff so hopefully this still is clear for the community:
index=a ( sourcetype="b" OR sourcetype="c" ) ( source="/path/file1" OR source="/path/file2" ) fieldA=* ( fieldB=val1 OR fieldC=val2 )
| convert timeformat="%m/%d/%y" ctime(_time) as Date1
| timechart span=1day dc(eval(fieldD.";".fieldE.";".Date1)) as countFieldName
| convert timeformat="%m/%d/%y" ctime(_time) as Date
| table Date countFieldName
Notice the use of dc(eval(fieldD.";".fieldE.";".Date1)). My thought was to create a unique string from the fields I would dedup against, then get a distinct count of those. Again, I assume I'm hacking this and there's probably a more trivial approach I should be using.
The goal is to enable report acceleration on a pre-existing saved search - but the saved search was designed with dedup on several fields before the timechart command. So the folks that use the saved search want to timechart some distinct values. Is that more clear? Thanks for the clarifying questions.
I believe I found a solution: do a stats count by field1 field2 field3, where field3 is the timespan (in this case, just the day of _time). If I'm thinking clearly, that will dedup by those three fields. Then, if I want a total count, I can do another stats count. This results in a distinct count. I believe this should be more efficient than the eval approach.
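For anyone who wants to try this, here is a rough sketch of the two-stage stats idea, reusing the abstracted field names from the search above. Using bin to round _time down to the day is my own assumption about how to get the "day" field; the first stats collapses each day/fieldD/fieldE combination to one row, and the second counts those rows per day. I have not confirmed that this form qualifies for report acceleration on every version, so treat it as a starting point rather than a verified fix.

```spl
index=a ( sourcetype="b" OR sourcetype="c" ) ( source="/path/file1" OR source="/path/file2" ) fieldA=* ( fieldB=val1 OR fieldC=val2 )
| bin _time span=1d
| stats count by _time fieldD fieldE
| stats count as countFieldName by _time
| convert timeformat="%m/%d/%y" ctime(_time) as Date
| table Date countFieldName
```

One thing to watch: stats with a by clause leaves out events where any of the by fields is missing, which should match what dedup does by default, but it is worth comparing the counts against the original dedup search over a short time range.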