Splunk Search

How to avoid subsearch auto-finalize in query performing outer join against lookup table

tpflicke
Path Finder

I've got an inventory list, which greatly simplified looks like below and made it available to splunk as a lookup table.

host,os_type
m00001,linux
m00002,linux
m00003,linux

What I want to do is list the number of records against the inventory, including where the count is 0.
The query below uses an outer join and works but for anything longer than a few minutes I get

[subsearch]: Search auto-finalized after time limit (60 seconds) reached.

To be of value the count, i.e. inner query, would need to run for relatively long periods, say 1 day.

| inputlookup server_list
| fields host  
| join type=outer host [ search index=some_index | stats count by host ]
| fillnull value=0 count 

I pondered using metadata but both metadata and inputlookup need to be the first command so that seems to be a non-starter.

I can use the REST API and get the desired result by effectively doing the outer join outside Splunk but I wonder what other options exist.
I am probably not able to increase the subquery auto-finalize limit.

0 Karma
1 Solution

gkanapathy
Splunk Employee
Splunk Employee

There are a couple of ways. First, another answer has suggested using the metadata command, which is fine as long are you're just counting by host. If you're needing a more specific query, a count by host+source, or something else, that won't help you (though in version 6.0 you can doing things like | tstats count WHERE source=xyz GROUPBY host,source very quickly using any other indexed field, or you can similarly use an accelerated data model for more complex queries).

But I would say that you can just reverse the order in general:

index=x somekeywords | stats count by host | join host [ inputlookup server_list ]

This misses out any empty hosts, so you won't have zeros, though there are workarounds to this, like:

index=x somekeywords | append [ inputlookup server_list ] | stats count by host | eval count=count-1 | join host [ inputlookup server_list ]

This is really what you want, not to run the main search in a subsearch.

But really, the better answer for the functionality you want is simply a lookup, which you can configure as an auto lookup (see props.conf and transforms.conf) or inline:

index=x somekeywords | stats count by host | lookup server_list host

or

index=x somekeywords | append [ inputlookup server_list ] | stats count by host | eval count=count-1 | lookup server_list host

View solution in original post

gkanapathy
Splunk Employee
Splunk Employee

There are a couple of ways. First, another answer has suggested using the metadata command, which is fine as long are you're just counting by host. If you're needing a more specific query, a count by host+source, or something else, that won't help you (though in version 6.0 you can doing things like | tstats count WHERE source=xyz GROUPBY host,source very quickly using any other indexed field, or you can similarly use an accelerated data model for more complex queries).

But I would say that you can just reverse the order in general:

index=x somekeywords | stats count by host | join host [ inputlookup server_list ]

This misses out any empty hosts, so you won't have zeros, though there are workarounds to this, like:

index=x somekeywords | append [ inputlookup server_list ] | stats count by host | eval count=count-1 | join host [ inputlookup server_list ]

This is really what you want, not to run the main search in a subsearch.

But really, the better answer for the functionality you want is simply a lookup, which you can configure as an auto lookup (see props.conf and transforms.conf) or inline:

index=x somekeywords | stats count by host | lookup server_list host

or

index=x somekeywords | append [ inputlookup server_list ] | stats count by host | eval count=count-1 | lookup server_list host

somesoni2
Revered Legend

I have the similar setup (server name inventory) and below query works for me just fine using metadata. (you need to add a '|' before metadata to make is first command)

|inputlookup serverInventory.csv | fields serverName | rename serverName as host | join type=outer host [|metadata type=hosts index=some_index | table host, totalCount]

tpflicke
Path Finder

Ah, didn't think of trying metadata or inputlookup as first element of a sub-query. This certainly opens up possibilities!

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Community Content Calendar, September edition

Welcome to another insightful post from our Community Content Calendar! We're thrilled to continue bringing ...

Splunkbase Unveils New App Listing Management Public Preview

Splunkbase Unveils New App Listing Management Public PreviewWe're thrilled to announce the public preview of ...

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...