Splunk Search

Fillnull in outer join

pjtbasu
Explorer

Hi,

I require a table containing count of specific service compared between 2 time ranges.

table 1 (time - now)
servicename | count
aaa 2
bbb 3
ccc 4

table 2 (time - previous time with timerange)
servicename | count
bbb 2
ddd 2
ccc 4

After search expectation -

servicename | countnow| oldcount | delta
aaa 2 0 2
bbb 3 2 1
ccc 4 4 0 (added just for understanding, not expected in the actual output, as I've added delta!=0 in my query at the end)
ddd 0 2 -2

Current output with my query (Outer Join works fine but I can't see the difference value, which is the actual requirement) -
aaa 2 (null value is coming, and delta value is coming as null as well)
bbb 3 2 1
ccc 4 4 0 (Coming up in search, even though it should not!)
ddd 2 0

My query is like this -
index=* myquery earliest=-15m latest=now |stats count as CurrentCount by servicename|join type=outer servicename [search index=* myquery earliest=1578492000 latest=1578492900|stats count as OldCount by servicename]
|eval DiffValue= CurrentCount - OldCount |table servicename,OldCount,CurrentCount,DiffValue|search Diffvalue!=0

Don't want to change whole lot in the query as the table output is 1000 rows and outer join works fine. Could you please help around difference calculation part and fillnull part?

Tags (3)
0 Karma

woodcock
Esteemed Legend

NEVER USE join. Try this:

index="*" AND "myquery" (earliest=-15m latest=now OR earliest=1578492000 latest=1578492000)
| stats count AS debugCount count(eval(_time >= relative_time(now(), "-15m") AND time<=now())) AS CurrentCount, count(eval(_time >= 1578492000 AND time<=1578492000)) AS OldCount BY servicename
| eval DiffValue= CurrentCount - OldCount 
| where Diffvalue!=0
| table servicename, OldCount, CurrentCount, DiffValue
0 Karma

niketn
Legend

@acfecondo75 trust me on Splunk Answers every community contributor's prime focus is the support for the poster. We do tend to give hints/ performance optimization tips rather than spoon feeding solutions. If you notice closely @kamlesh_vaghela's approach is different than yours. Also Kamlesh has posted his answer as a comment for @pjtbasu to try and confirm. Which means his comment can not be accepted as Answer, but your's can. However, the best solution approach is by @to4kawa which tried to avoid subsearch completely and would perform better as well with map-reduce commands.

Unlike other online forums, Splunk Answers thrives on positive sentiments hence down-voting is rarely used by community members unless the suggested solution is detrimental/irreversible for Splunk stack. Refer to the Voting etiquette docs for Splunk Answers Community participation.

Having said all these, let me up-vote both of your comments to get you on same page to keep assisting folks in need. Without our active participation and positive mindset the community will not thrive... so keep Splunking and Happy Weekend.

There has to be upvote for @to4kawa 's answer once he fixes outer join approach as that is the best way to perform such kind of correlation. While there is no silver bullet for correlation, this is a good Splunk Documentation for Event Grouping and correlation: https://docs.splunk.com/Documentation/Splunk/latest/Search/Abouteventcorrelation

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

acfecondo75
Path Finder

after reading your response, I agree that i was hasty to downvote. I switched to an upvote and removed the comment.

to4kawa
Ultra Champion
index=myquery (earliest=-15m latest=now) OR (earliest=1578492000 latest=1578492900)
| eval date_label=case(_time >= relative_time(_time,"-15m") , "current" ,_time <= 1578492900, "old")
| stats count(eval(date_label="current")) as CurrentCount count(eval(date_label="old")) as OldCount by servicename 
| eval DiffValue= CurrentCount - OldCount 
| where DiffValue!="0"
| table servicename,OldCount,CurrentCount,DiffValue

Hi, @pjtbasu
your request:

Don't want to change whole lot in the query

I see, but your query 's join is too slow and need fillnull .
How about my recommend?
The current and past logs are searched once and tabulated for each label.
It's fast.

to4kawa
Ultra Champion

Main search:

index= myquery earliest=-15m latest=now 
|stats count as CurrentCount by servicename

Result:

servicename,count
service_A,20
service_B,21
service_C,19

Sub search result:

 servicename,count
 service_A,17
 service_C,30

I think the reason @pjtbasu used join type=outer is to handle the case where the sub-search results in a different service name than the main search, as in the case above.

Before giving this answer, I placed about 10 service names on a trial basis in two days and counted them for each 15 minutes with a difference of 12 hours.

As a result, depending on the service name, I have confirmed that 0 was properly displayed.
sorry. I could not retry. I forget the query. but

| makeresults 
| eval _raw="No,value,service
1,1,A
2, ,B
3,1,C"
| multikv forceheader=1
| stats count(eval(value=1)) as count by service

This result outputs "0", maybe my answer is OK.
But how about it actually? I would like @pjtbasu to check that.

0 Karma

niketn
Legend

In Simple words join type outer when query before join command runs for last 15min and query after join runs for specific timestamp, all the services from last 15 min must be present in the result.

I looked at the code and the question and does not seem like the two are in sync 🙂 Seems like the requirement is for full outer join not just outer join. Full outer join is not available through the join command. Your solution will still work however needs to be tweaked as the requirement is not to filter out results where counts are same in current and past, but to have 0 values when the service is not available. So | where DiffValue!=0 does not seem to be required. Rest all is fine and actually optimal.

PS: I was confused between the requirement of outer join (which in Splunk is left join) and full outer join (which seems to be requirement and handled in your code)

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

to4kawa
Ultra Champion

full outer join , I haven't know till now. I see.
Thanks @niketnilay

0 Karma

niketn
Legend

@to4kawa, this is the right approach to pull required data from index and use stats for correlation. However, original query had left/outer join which means you were supposed to retain all the values from first time range which means you would need to do either values(date_label) as uniqueLables or dc(date_label) as uniqueLables and filter all values with uniqueLables for current and non-zero values from old. Which means finally | fillnull CurrentCount will be required even in your query! 🙂

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@pjtbasu

I have made some changes in your search. Can you please try this?

index=myquery earliest=-15m latest=now 
| stats count as CurrentCount by servicename 
| append 
    [ search index=myquery earliest=1578492000 latest=1578492900 
    | stats count as OldCount by servicename] 
| stats values(CurrentCount) as CurrentCount values(OldCount) as OldCount by servicename
| fillnull value=0
| eval DiffValue= CurrentCount - OldCount 
| where DiffValue!="0"
| table servicename,OldCount,CurrentCount,DiffValue
0 Karma

pjtbasu
Explorer

@kamlesh_vagela Your search did the job! Thank you!

Thanks @niketnilay & @acfecondo75 for your inputs and end of the day, the knowledge and willingness to help, of this community is remarkable. You guys rock!!

0 Karma

niketn
Legend

@pjtbasu, as stated earlier did you get to try @to4kawa 's approach of avoiding subsearch? That would be the best solution for your use case.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

Yes @pjtbasu 🙂 you are right.. please accept and close it.

0 Karma

acfecondo75
Path Finder

Hi @pjtbasu,

In the query you provided, your subsearch earliest and latest time are exactly the same value. Not sure if this was copy paste error here or if that's how it was when you ran the search. But if it's the latter, that's most likely why you're not getting a count from your join. As far as fixing the null issue, just add the following line after the subsearch but before the eval for DiffValue:

| fillnull value=0 OldCount

Then the reason the 0s are still showing is because you have a typo in the last line of your search. You wrote Diffvalue!=0 instead of DiffValue!=0 (capitalize the V so the field names match).

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...