Splunk Search

Join two searches together by host

scout29
Path Finder

I am trying to join two searches together to table the combined results by host.

First search below is showing number of events in the last hour by host, index, and sourcetype:

| tstats count where index=* by host, index, sourcetype | addtotals | sort -Total | fields - Total | rename count as events_latest_hour


Second search is showing the ingest per hour in GB by host. 

(index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2) | rename h as host, Usage as usage_lastest_hour | addtotals | sort -Total | fields - Total



Can you please help with how i would join these two searches together to display the host, index, sourcetype, events_latest_hour,  usage_lastest_hour

Basically i want to table the results of the first search and also include the results "usage_lastest_hour"from the second search into the table. 

 

0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

As @PickleRick said, your question is unclear on a key element: Desired output because your first search groups by three fields (host, index, and sourcetype) whereas the last search can only give one of the three (host).  Does this mean you want to give each host the same usage_lastest_hour no matter which index or sourcetype the first search output come from?  In that case, you can do something like

 

 

| tstats count where index=* by host, index, sourcetype
| append
    [search (index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
    | stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
    | rename h as host, Usage as usage_lastest_hour]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour

 

 

Note:

  1. There can only be one primary sort order.  I choose events_latest_hour as it appears to be the most logical.
  2. The addtotal command does nothing in either search; Total value is identical to the singular numeric field in each.  So, I scraped it.

View solution in original post

Tags (1)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

As @PickleRick said, your question is unclear on a key element: Desired output because your first search groups by three fields (host, index, and sourcetype) whereas the last search can only give one of the three (host).  Does this mean you want to give each host the same usage_lastest_hour no matter which index or sourcetype the first search output come from?  In that case, you can do something like

 

 

| tstats count where index=* by host, index, sourcetype
| append
    [search (index=_internal host=splunk_shc source=*license_usage.log* type=Usage)
    | stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
    | rename h as host, Usage as usage_lastest_hour]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour

 

 

Note:

  1. There can only be one primary sort order.  I choose events_latest_hour as it appears to be the most logical.
  2. The addtotal command does nothing in either search; Total value is identical to the singular numeric field in each.  So, I scraped it.
Tags (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

My additional two cents on that - This combined search is using a subsearch results from which are appended to the results of "main" initial search. You have to understand limitations of subsearches. They have limits for returned results (which you might not hit here as you're summarizing the data with stats so you'd be returning just a bunch of rows probably) and - more importantly - execution time.

It's important because if your subsearch runs for too long it gets finalized silently which means that only values calculated so far are returned to the outer search and you have no indication whatsoever that the subsearch wasn't allowed to run to its natural end. So in the end you might get no results/incomplete results/wrong results and not be aware of it.

Therefore it's advisable to:

1. Keep the searches short (meaning not searching through a lot of data)

2. If possible, use indexed fields (like with the tstats command)

3. If you have two searches which significantly differ in terms of number of results and execution time, use the small/short one as the appended/joined/whatever subsearch.

So in case of this particular scenario I'd swap the initial raw data search with tstats to lower the probability of the whole search "running away".

index=_internal host=splunk_shc source=*license_usage.log* type=Usage
| stats sum(b) as Usage by h | eval Usage=round(Usage/1024/1024/1024,2)
| rename h as host, Usage as usage_lastest_hour 
   | append [
      | tstats count where index=* by host, index, sourcetype ]
| stats values(count) as events_latest_hour values(usage_lastest_hour) as usage_lastest_hour by host, index, sourcetype
| sort - events_latest_hour, usage_lastest_hour

scout29
Path Finder

This is exactly what i am looking for - however for some reason i am not getting any values for the field "usage_lastest_hour"  - any idea why this field is not displaying results? All the others are displaying as expected with the search you provided.

0 Karma

scout29
Path Finder

Got it to work - thank you

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Unfortunately, those searches are of different types (one starts with a streaming search command, the other with report-generating tstats command) which means you can't combine them into one search and process combined results or use multisearch to run both in parallel.

You're limited to either using join command as you attempted or appending one results set to another and then do some summarizing stats.

Having said that - I don't quite get how you imagine your desired output since the tstats will split results by three fields whereas your raw index search returns stats split only by host.

scout29
Path Finder

Not able to get that to work

0 Karma

marnall
Motivator

Are you able to use the join command based on host?

<search 1>
| join host [<search 2>]

0 Karma
Get Updates on the Splunk Community!

Monitoring MariaDB and MySQL

In a previous post, we explored monitoring PostgreSQL and general best practices around which metrics to ...

Financial Services Industry Use Cases, ITSI Best Practices, and More New Articles ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Splunk Federated Analytics for Amazon Security Lake

Thursday, November 21, 2024  |  11AM PT / 2PM ET Register Now Join our session to see the technical ...