I am trying to join two searches together to table the combined results by host.
First search below is showing number of events in the last hour by host, index, and sourcetype:
| tstats count where index=* by host, index, sourcetype | addtotals | sort -Total | fields - Total | rename count as events_latest_hour
Second search is showing the ingest per hour in GB by host.
(index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2) | rename h as host, Usage as usage_lastest_hour | addtotals | sort -Total | fields - Total
Can you please help with how i would join these two searches together to display the host, index, sourcetype, events_latest_hour, usage_lastest_hour
Basically i want to table the results of the first search and also include the results "usage_lastest_hour"from the second search into the table.
As @PickleRick said, your question is unclear on a key element: Desired output because your first search groups by three fields (host, index, and sourcetype) whereas the last search can only give one of the three (host). Does this mean you want to give each host the same usage_lastest_hour no matter which index or sourcetype the first search output come from? In that case, you can do something like
| tstats count where index=* by host, index, sourcetype
| append
[search (index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
| rename h as host, Usage as usage_lastest_hour]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour
Note:
As @PickleRick said, your question is unclear on a key element: Desired output because your first search groups by three fields (host, index, and sourcetype) whereas the last search can only give one of the three (host). Does this mean you want to give each host the same usage_lastest_hour no matter which index or sourcetype the first search output come from? In that case, you can do something like
| tstats count where index=* by host, index, sourcetype
| append
[search (index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
| rename h as host, Usage as usage_lastest_hour]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour
Note:
My additional two cents on that - This combined search is using a subsearch results from which are appended to the results of "main" initial search. You have to understand limitations of subsearches. They have limits for returned results (which you might not hit here as you're summarizing the data with stats so you'd be returning just a bunch of rows probably) and - more importantly - execution time.
It's important because if your subsearch runs for too long it gets finalized silently which means that only values calculated so far are returned to the outer search and you have no indication whatsoever that the subsearch wasn't allowed to run to its natural end. So in the end you might get no results/incomplete results/wrong results and not be aware of it.
Therefore it's advisable to:
1. Keep the searches short (meaning not searching through a lot of data)
2. If possible, use indexed fields (like with the tstats command)
3. If you have two searches which significantly differ in terms of number of results and execution time, use the small/short one as the appended/joined/whatever subsearch.
So in case of this particular scenario I'd swap the initial raw data search with tstats to lower the probability of the whole search "running away".
index=_internal host=splunk_shc source=*license_usage.log* type=Usage
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
| rename h as host, Usage as usage_lastest_hour
| append [
| tstats count where index=* by host, index, sourcetype ]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour
This is exactly what i am looking for - however for some reason i am not getting any values for the field "usage_lastest_hour" - any idea why this field is not displaying results? All the others are displaying as expected with the search you provided.
Got it to work - thank you
Unfortunately, those searches are of different types (one starts with a streaming search command, the other with report-generating tstats command) which means you can't combine them into one search and process combined results or use multisearch to run both in parallel.
You're limited to either using join command as you attempted or appending one results set to another and then do some summarizing stats.
Having said that - I don't quite get how you imagine your desired output since the tstats will split results by three fields whereas your raw index search returns stats split only by host.
Not able to get that to work
Are you able to use the join command based on host?
<search 1>
| join host [<search 2>]