Splunk Search

Join type and extra data?

hollybross1219
Path Finder

Here is my query (time range is YTD):

(splunk_server=indexer* index=wsi_tax_summary sourcetype=stash capability=109* tax_year=2019 ein=* intuit_offeringid=* 
partnerId!=*test* partnerId=*) 
| timechart span=1d dc(intuit_tid) as 19attempts 
| streamstats sum(19attempts) as 19attempts
| eval time=strftime(_time,"%m-%d") 
| join type=left time 
   [ inputlookup TY18_Splunk_total_data.csv 
    | where capability="109X" 
    | stats sum(attempts) as 18attempts by _time 
    | streamstats sum(18attempts) as 18attempts
    | eval time=strftime(strptime(_time,"%m/%d/%Y"), "%m-%d")
    | fields time 18attempts]
| fields time 19attempts 18attempts
| rename 19attempts as "TY19"
| rename 18attempts as "TY18"

I understand a left join to mean that if the results from my subsearch don't match with the main search, it won't be included. If I run the query above, I get data in TY18 column from 01-02 thru 01-09 (below).

alt text

I didn't expect data against those dates, so I copied the subsearch and ran it in a separate search window, and I can see (as I expected) there's no data from 01-02 thru 01-09 (below).
alt text

Am I not understanding something about join type? What's happening here?

0 Karma

to4kawa
Ultra Champion

different year's %m-%d values is populated. maybe 2020/01/02 to 2020/01/09

0 Karma

hollybross1219
Path Finder

So I think it might be a problem with the data in the lookup, but I don't understand what's wrong. When I do this basic query:

| inputlookup TY18_Splunk_total_data.csv 
| where capability="109X"

I can see data against dates 1/2/2019 - 1/9/2019
alt text

Aggregating this data, I lose the data against these dates for some reason, and I don't understand why. I ran this query which generated these results. I'm at a loss...

| inputlookup TY18_Splunk_total_data.csv 
| where capability="109X"
| stats sum(attempts) by _time

alt text

0 Karma

to4kawa
Ultra Champion

| inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
| stats sum(attempts) by _time
| sort 0 _time

try and check the result.

and when you usestreamstats , you should sort it.
Splunk fields is ascii order(1,10,2,20 ... ) and result order same at sometime.

0 Karma

kmugglet
Communicator

Hi,

You should be able to get the same results without using join

(splunk_server=indexer* index=wsi_tax_summary sourcetype=stash capability=109* tax_year=2019 ein=* intuit_offeringid=* 
    partnerId!=*test* partnerId=*) 
| timechart span=1d dc(intuit_tid) as 19attempts 
| streamstats sum(19attempts) as 19attempts 
| eval time=strftime(_time,"%m-%d") 
| append 
    [ inputlookup TY18_Splunk_total_data.csv 
    | where capability="109X" 
    | stats sum(attempts) as 18attempts by _time 
    | streamstats sum(18attempts) as 18attempts 
    | eval time=strftime(strptime(_time,"%m/%d/%Y"), "%m-%d") 
    | fields time 18attempts] 
| stats values(19attempts) AS TY19, values(18attempts) AS TY18 by time 

Cheers, Keith

0 Karma

hollybross1219
Path Finder

Hi @kmugglet, thanks for the suggestion. I tried this and it didn't work :(. Join is what I'm looking for because I'm comparing days (represented by the row) across different years (represented by the two columns).

0 Karma

kmugglet
Communicator

Also you should be able to replace this line

[ inputlookup TY18_Splunk_total_data.csv WHERE capability=="109X" 

Sometimes it works, sometimes it doesn't - Yay Splunk

0 Karma
Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...