- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Join type and extra data?
Here is my query (time range is YTD):
(splunk_server=indexer* index=wsi_tax_summary sourcetype=stash capability=109* tax_year=2019 ein=* intuit_offeringid=*
partnerId!=*test* partnerId=*)
| timechart span=1d dc(intuit_tid) as 19attempts
| streamstats sum(19attempts) as 19attempts
| eval time=strftime(_time,"%m-%d")
| join type=left time
[ inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
| stats sum(attempts) as 18attempts by _time
| streamstats sum(18attempts) as 18attempts
| eval time=strftime(strptime(_time,"%m/%d/%Y"), "%m-%d")
| fields time 18attempts]
| fields time 19attempts 18attempts
| rename 19attempts as "TY19"
| rename 18attempts as "TY18"
I understand a left join to mean that if the results from my subsearch don't match with the main search, it won't be included. If I run the query above, I get data in TY18 column from 01-02 thru 01-09 (below).
I didn't expect data against those dates, so I copied the subsearch and ran it in a separate search window, and I can see (as I expected) there's no data from 01-02 thru 01-09 (below).
Am I not understanding something about join type? What's happening here?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
different year's %m-%d
values is populated. maybe 2020/01/02 to 2020/01/09
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So I think it might be a problem with the data in the lookup
, but I don't understand what's wrong. When I do this basic query:
| inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
I can see data against dates 1/2/2019 - 1/9/2019
Aggregating this data, I lose the data against these dates for some reason, and I don't understand why. I ran this query which generated these results. I'm at a loss...
| inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
| stats sum(attempts) by _time
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
| inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
| stats sum(attempts) by _time
| sort 0 _time
try and check the result.
and when you usestreamstats
, you should sort it.
Splunk fields is ascii order(1,10,2,20 ... ) and result order same at sometime.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
You should be able to get the same results without using join
(splunk_server=indexer* index=wsi_tax_summary sourcetype=stash capability=109* tax_year=2019 ein=* intuit_offeringid=*
partnerId!=*test* partnerId=*)
| timechart span=1d dc(intuit_tid) as 19attempts
| streamstats sum(19attempts) as 19attempts
| eval time=strftime(_time,"%m-%d")
| append
[ inputlookup TY18_Splunk_total_data.csv
| where capability="109X"
| stats sum(attempts) as 18attempts by _time
| streamstats sum(18attempts) as 18attempts
| eval time=strftime(strptime(_time,"%m/%d/%Y"), "%m-%d")
| fields time 18attempts]
| stats values(19attempts) AS TY19, values(18attempts) AS TY18 by time
Cheers, Keith
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @kmugglet, thanks for the suggestion. I tried this and it didn't work :(. Join is what I'm looking for because I'm comparing days (represented by the row) across different years (represented by the two columns).
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Also you should be able to replace this line
[ inputlookup TY18_Splunk_total_data.csv WHERE capability=="109X"
Sometimes it works, sometimes it doesn't - Yay Splunk
