Splunk Search

SPL Question: Using Lookup field values in a tstats search

IAskALotOfQs
Path Finder

Hi all, getting to grips with SPL and would be forever grateful if someone could lend their brain for the below:

 

I've got the lookup in the format below:

(Fields) -->  host, os, os version

-----------------------------------------

(Values) ---> Server01, Windows, Windows Server 2019

 

But in my case, this lookup has 3000 field values, I want to know their source values in Splunk (This lookup was generated by a match condition with another, so I KNOW that these hosts are present in my Splunk env)

 

I basically need a way to do the following:

 

"| tstats values(sources) where index=* host=(WHATEVER IS IN MY LOOKUP HOST FIELD) by index, host"

 

But i can't seem to find a way, I did try to originally match the below:

 

| tstats values(source) where index=* by host, index

| join type=inner host

| [|inputlookup mylookup.csv | fields host | dedup host]

 

But my results were too large to handle by Splunk, plz help

Labels (4)
Tags (1)
0 Karma

bowesmana
SplunkTrust
SplunkTrust

Forget join - that is not a splunk way of doing things.

Use either a subsearch or a lookup - they may perform differently depending on data volumes, but you can do this

Subsearch method

| tstats values(sources) where index=* 
  [ | inputlookup mylookjup.csv | fields host | dedup host ]
 by index, host

The subsearch will effectively return with ( host=x OR host=y OR host=z...) which is then used in the outer search.

Lookup method

| tstats values(sources) where index=* by index, host
| lookup mylookjup.csv host

This gets ALL the data from the indexes and then does the lookup to get the OS details.

You can always do

| where isnull(os) 

which will then show those hosts that do not exist in the lookup that are found in the data.

Note that the lookup CSV will be case sensitive - if you want to make it insensitive, make a lookup definition and configure it as case insensitive

IAskALotOfQs
Path Finder

Appreciate your speedy reply!

 

I've just checked to see how by "subsearch" would run, I've changed my time picker to 30 mins and it hasn't run anything, its been stuck at 0 of 0 events matched with bar flashing....

 

Also, the second way "lookup" uses the same concept of my previous search, I will most likely run into a "VV data is too large for serialization" error

0 Karma

bowesmana
SplunkTrust
SplunkTrust

The reason why your subsearch is taking a long time is _probably_ due to the volume of hosts, because using a large X=Y OR A=B OR C=D expression in the search can be very slow to get parsed and setup, hence the lookup option can often be the better option.

The second way is fundamentally on a different planet to your concept of the previous search. Using join in itself is limited and using join+inputlookup is a completely wrong way to use lookups. The lookup command is designed to enrich data with results from a lookup. If a result cannot be found in the lookup, you will not get results from the lookup and you can validate that state.

Have you tried it?

0 Karma

yuanliu
SplunkTrust
SplunkTrust

I've just checked to see how by "subsearch" would run, I've changed my time picker to 30 mins and it hasn't run anything, its been stuck at 0 of 0 events matched with

The question is: have you manually verified that your data in the new time period actually contain matching hosts?  Here is a quick way to confirm that the subsearch method works.

1. Select a couple hosts that you know some events match in this time period match.  Say, host1 and host2.  Run this search with the chosen time picker.

 

| tstats values(source)
  where index=* (host="host1" OR host="host2")
  earliest=-30m@h latest=-0@h
  by host, index

 

If this search gives you no output, you need to find another couple of hosts till you can get non-zero output. (Hint: it is best to run all tests with a time boundaries where you know your tests will not cross.  I would suggesting using fixed earliest/latest rather than time picker.  For example, earliest=-30m@h latest=-0m@h)

2. Make sure that host1 and host2 exist in mylookup.csv with this search

 

| inputookup mylookup.csv where host IN (host1, host2)
| fields host
| dedup host
| format

 

As @bowesmana explained, the output should be something like

search
( ( host="host1" ) OR ( host="host2" ) )

If your result is different, that means mylookup.csv does not host1 and host2. You then need to redesign/repopulate your lookup table.

3. Run the following combined search

 

| tstats values(source)
  where index=*
    [inputookup mylookup.csv where host IN (host1, host2)
    | fields host
    | dedup host]
  earliest=-30m@h latest=-0@h
  by host, index

 

This search should give you the exact same results as the first one.

Then, you just remove there where clause in the subsearch.


Also, the second way "lookup" uses the same concept of my previous search, I will most likely run into a "VV data is too large for serialization" error


Why do you say @bowesmana's lookup method uses the same concept as your join method?  Have you tried it?  It is totally different because it doesn't involve join.  Are you suggesting that | tstats values(source) where index=* by host, index always gives you that error? What about | tstats values(source) where index=* by host? What about | tstats values(source) where index=*?  If these searches give you error, you may have some fundamental problem in your indexer.  No amount of SPL can save the day.

Given the chance, however, I would use the subsearch method because it is the fastest.

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...