Getting Data In

How can I find hosts from a CSV that are and aren't returning events to Splunk?

Builder

I've been poking around Splunk Answers for a while today and can't quite match the scenario I've got. I considered https://answers.splunk.com/answers/176466/how-to-use-eval-if-there-is-no-result-from-the-bas-1.html but couldn't quite make it work for my needs.

I have a CSV that contains a list of hostnames which works fine as a lookup. I want to find which of those hosts are returning events as well as which of them are not. In this case, those that are not sending events to Splunk probably have never sent events to Splunk. It's not like the non-senders sent events once but have stopped recently.

I see I can get a list of event-sending hosts with

index=_internal sourcetype=splunkd source=*splunkd.log 
    [| inputlookup my_hosts_lookup
    | fields host 
    | format ] 
| dedup host 
| sort host 
| table host

But if I want to figure out which of the hosts in that lookup have never sent anything, I'm at a loss. Certainly the full lookup table minus the event-sending hosts == the missing hosts, but I'm not sure how to do that either.

If I could do them all in a single result perhaps with an eval field indicating that it's sending or not, that would be great, but even a separate search that identify the non-senders would be great.

Thank you

0 Karma
1 Solution

Champion

Answer that was accepted as working, for ease of finding by others:

  | tstats count WHERE index=_internal sourcetype=splunkd BY host
  | search
   [| inputlookup my_hosts_lookup 
    | fields host 
    | format ]
  | eval host=lower(host)
  | eval found_in_logs="true"
  | append [| inputlookup my_hosts_lookup]
  | stats values(found_in_logs) AS found_in_logs BY host
  | fillnull value="false" found_in_logs

Original answer:

Another option, using tstats (and in general, running some form of stats as early as possible:

| tstats count WHERE index=_internal sourcetype=splunkd BY host
| eval found_in_logs="true"
| append [| inputlookup my_hosts_lookup]
| stats values(found_in_logs) AS found_in_logs BY host
| fillnull value="false" found_in_logs

View solution in original post

Champion

Answer that was accepted as working, for ease of finding by others:

  | tstats count WHERE index=_internal sourcetype=splunkd BY host
  | search
   [| inputlookup my_hosts_lookup 
    | fields host 
    | format ]
  | eval host=lower(host)
  | eval found_in_logs="true"
  | append [| inputlookup my_hosts_lookup]
  | stats values(found_in_logs) AS found_in_logs BY host
  | fillnull value="false" found_in_logs

Original answer:

Another option, using tstats (and in general, running some form of stats as early as possible:

| tstats count WHERE index=_internal sourcetype=splunkd BY host
| eval found_in_logs="true"
| append [| inputlookup my_hosts_lookup]
| stats values(found_in_logs) AS found_in_logs BY host
| fillnull value="false" found_in_logs

View solution in original post

Builder

Thank for your reply. As I noted in the comment above, I was looking to get only the status of the hosts in my lookup table which his only 375. This search returns 1190. We currently get results from 815 UF's. So... "bonus hosts" 🙂

As above, this approach is also education for me for future SPL's.

0 Karma

Let's give @micah's approach a scoping adjustment at the outset, and then I think it will work for you.

| tstats count WHERE index=_internal sourcetype=splunkd BY host
| search
 [| inputlookup my_hosts_lookup 
  | fields host 
  | format ]
| eval found_in_logs="true"
| append [| inputlookup my_hosts_lookup]
| stats values(found_in_logs) AS found_in_logs BY host
| fillnull value="false" found_in_logs
0 Karma

Builder

I ran this and got 583 results again. I looked at the individual search components until I realized what the issue was. Some hosts may report themselves to Splunk with a name that's in all caps rather than what I've got in the lookup.

So with just a little tweak to the above:

 | tstats count WHERE index=_internal sourcetype=splunkd BY host
 | search
  [| inputlookup my_hosts_lookup 
   | fields host 
   | format ]
 | eval host=lower(host)
 | eval found_in_logs="true"
 | append [| inputlookup my_hosts_lookup]
 | stats values(found_in_logs) AS found_in_logs BY host
 | fillnull value="false" found_in_logs

I get 375 results returned. Thanks very much for you help!

0 Karma

This should give you a table that displays "True" for hosts that were found in the source logs and "False" for hosts from the lookup that had no log entries in the timespan searched:

index=_internal sourcetype=splunkd source=*splunkd.log 
[ | inputlookup my_hosts_lookup
 | fields host 
 | format ]
| eval found_in_logs=1
| append
 [ | inputlookup my_hosts_lookup
    | fields host ] 
| stats max(found_in_logs) AS found_in_logs BY host 
| eval found_in_logs=if(found_in_logs=1, "True", "False")

Note: I realized this wasn't scoped appropriately, so I modified the code.

0 Karma

Builder

Thanks very much for your reply. My actual list of servers of my lookup is actually only 375. This search returns 583 results. I was really just looking for the status of those 375, not a larger portion of my overall hosts. 583 seems to be 208 (number of hosts that are reporting in) + 375 (total number of hosts from the lookup table).

But, I will say this search is education for me and helpful to understand new approaches with SPL.

0 Karma