I require a table containing count of specific service compared between 2 time ranges.
table 1 (time - now)
servicename | count
table 2 (time - previous time with timerange)
servicename | count
After search expectation -
servicename | countnow| oldcount | delta
aaa 2 0 2
bbb 3 2 1
ccc 4 4 0 (added just for understanding, not expected in the actual output, as I've added delta!=0 in my query at the end)
ddd 0 2 -2
Current output with my query (Outer Join works fine but I can't see the difference value, which is the actual requirement) -
aaa 2 (null value is coming, and delta value is coming as null as well)
bbb 3 2 1
ccc 4 4 0 (Coming up in search, even though it should not!)
ddd 2 0
My query is like this -
index=* myquery earliest=-15m latest=now |stats count as CurrentCount by servicename|join type=outer servicename [search index=* myquery earliest=1578492000 latest=1578492900|stats count as OldCount by servicename]
|eval DiffValue= CurrentCount - OldCount |table servicename,OldCount,CurrentCount,DiffValue|search Diffvalue!=0
Don't want to change whole lot in the query as the table output is 1000 rows and outer join works fine. Could you please help around difference calculation part and fillnull part?
join. Try this:
index="*" AND "myquery" (earliest=-15m latest=now OR earliest=1578492000 latest=1578492000) | stats count AS debugCount count(eval(_time >= relative_time(now(), "-15m") AND time<=now())) AS CurrentCount, count(eval(_time >= 1578492000 AND time<=1578492000)) AS OldCount BY servicename | eval DiffValue= CurrentCount - OldCount | where Diffvalue!=0 | table servicename, OldCount, CurrentCount, DiffValue
@acfecondo75 trust me on Splunk Answers every community contributor's prime focus is the support for the poster. We do tend to give hints/ performance optimization tips rather than spoon feeding solutions. If you notice closely @kamlesh_vaghela's approach is different than yours. Also Kamlesh has posted his answer as a comment for @pjtbasu to try and confirm. Which means his comment can not be accepted as Answer, but your's can. However, the best solution approach is by @to4kawa which tried to avoid subsearch completely and would perform better as well with map-reduce commands.
Unlike other online forums, Splunk Answers thrives on positive sentiments hence down-voting is rarely used by community members unless the suggested solution is detrimental/irreversible for Splunk stack. Refer to the Voting etiquette docs for Splunk Answers Community participation.
Having said all these, let me up-vote both of your comments to get you on same page to keep assisting folks in need. Without our active participation and positive mindset the community will not thrive... so keep Splunking and Happy Weekend.
There has to be upvote for @to4kawa 's answer once he fixes outer join approach as that is the best way to perform such kind of correlation. While there is no
silver bullet for correlation, this is a good Splunk Documentation for Event Grouping and correlation: https://docs.splunk.com/Documentation/Splunk/latest/Search/Abouteventcorrelation
index=myquery (earliest=-15m latest=now) OR (earliest=1578492000 latest=1578492900) | eval date_label=case(_time >= relative_time(_time,"-15m") , "current" ,_time <= 1578492900, "old") | stats count(eval(date_label="current")) as CurrentCount count(eval(date_label="old")) as OldCount by servicename | eval DiffValue= CurrentCount - OldCount | where DiffValue!="0" | table servicename,OldCount,CurrentCount,DiffValue
Don't want to change whole lot in the query
I see, but your query 's
join is too slow and need
How about my recommend?
The current and past logs are searched once and tabulated for each label.
index= myquery earliest=-15m latest=now |stats count as CurrentCount by servicename
servicename,count service_A,20 service_B,21 service_C,19
Sub search result:
servicename,count service_A,17 service_C,30
I think the reason @pjtbasu used
join type=outer is to handle the case where the sub-search results in a different service name than the main search, as in the case above.
Before giving this answer, I placed about 10 service names on a trial basis in two days and counted them for each 15 minutes with a difference of 12 hours.
As a result, depending on the service name, I have confirmed that 0 was properly displayed.
sorry. I could not retry. I forget the query. but
| makeresults | eval _raw="No,value,service 1,1,A 2, ,B 3,1,C" | multikv forceheader=1 | stats count(eval(value=1)) as count by service
This result outputs "0", maybe my answer is OK.
But how about it actually? I would like @pjtbasu to check that.
In Simple words join type outer when query before join command runs for last 15min and query after join runs for specific timestamp, all the services from last 15 min must be present in the result.
I looked at the code and the question and does not seem like the two are in sync 🙂 Seems like the requirement is for
full outer join not
just outer join.
Full outer join is not available through the join command. Your solution will still work however needs to be tweaked as the requirement is not to filter out results where counts are same in current and past, but to have 0 values when the service is not available. So
| where DiffValue!=0 does not seem to be required. Rest all is fine and actually optimal.
PS: I was confused between the requirement of outer join (which in Splunk is left join) and full outer join (which seems to be requirement and handled in your code)
@to4kawa, this is the right approach to pull required data from index and use stats for correlation. However, original query had
left/outer join which means you were supposed to retain all the values from first time range which means you would need to do either
values(date_label) as uniqueLables or
dc(date_label) as uniqueLables and filter all values with uniqueLables for current and
non-zero values from old. Which means finally
| fillnull CurrentCount will be required even in your query! 🙂
I have made some changes in your search. Can you please try this?
index=myquery earliest=-15m latest=now | stats count as CurrentCount by servicename | append [ search index=myquery earliest=1578492000 latest=1578492900 | stats count as OldCount by servicename] | stats values(CurrentCount) as CurrentCount values(OldCount) as OldCount by servicename | fillnull value=0 | eval DiffValue= CurrentCount - OldCount | where DiffValue!="0" | table servicename,OldCount,CurrentCount,DiffValue
@kamlesh_vagela Your search did the job! Thank you!
Thanks @niketnilay & @acfecondo75 for your inputs and end of the day, the knowledge and willingness to help, of this community is remarkable. You guys rock!!
@pjtbasu, as stated earlier did you get to try @to4kawa 's approach of avoiding subsearch? That would be the best solution for your use case.
In the query you provided, your subsearch earliest and latest time are exactly the same value. Not sure if this was copy paste error here or if that's how it was when you ran the search. But if it's the latter, that's most likely why you're not getting a count from your join. As far as fixing the null issue, just add the following line after the subsearch but before the eval for DiffValue:
| fillnull value=0 OldCount
Then the reason the 0s are still showing is because you have a typo in the last line of your search. You wrote Diffvalue!=0 instead of DiffValue!=0 (capitalize the V so the field names match).