Splunk Search

results are being truncated in join query

manishyadav91
New Member

Problem:

i have 200000 splunk events from which i only want 15000 events ( like vlookup in excel)

Splunk events contain(200000 hosts ) : host version kernel
lookuptable contain (15000 hosts): host window
i only want result for the host listed in lookup table(15000)

i tried using join but it truncates the result .

Query : | inputlookup "host.csv" | rename HOST as host | join host [search index=server_info platform=redhat message="SYSINFO*" host="*" ] | table host version kernel window | where window != " "

0 Karma

gcusello
Esteemed Legend

Hi manishyadav91,
In subsearches you have the limit of 50,000 results so you must put the search as first.
Then to use the join command with many events isn't a good idea becaus your search will be very slow.
So you could change your search like the following:

index=server_info platform=redhat message="SYSINFO*" host="*"  [ | inputlookup "host.csv" | rename HOST as host | fields host ]
| lookup host.csv HOST AS host OUTPUT version kernel window 
| table host version kernel window

Bye.
Giuseppe

manishyadav91
New Member

Hi Giuseppe,

Search 1 : index=server_info platform=redhat message="SYSINFO*" host="*"

| dedup host | table host Window kernel version
results 29467

Search 2 : | inputlookup Host-Q3.csv | table host Window
Result: 15679

Search combined : index=cba_chef platform=redhat message="SYSINFO*" host="*"

| lookup Host-Q3.csv host OUTPUT Window | where Window != " " | dedup host |
table host Window kernel version
results 3599

i dont know why the results are not complete, ideally combined searc should give 15000 events but it doesn't , i have tried all the solutions listed below but same results .

is there any other way to to search only limited events/host from the whole load of events.

0 Karma

starcher
SplunkTrust
SplunkTrust

Do not inputlookup and join. Do the search and use lookup as a lookup. You are breaking map reduce in several ways plus hitting limits.

SearchHere | lookup host.csv HOST as host OUTPUT HOST as isFound | where isnotnull(isFound) | stats count by host version kernel window | fields - count

0 Karma

HiroshiSatoh
Champion

The default for subsearch is as follows.
I think that it is good to use the LOOKUP command by setting the LOOKUP definition.

https://docs.splunk.com/Documentation/Splunk/7.3.1/Search/Aboutsubsearches

[subsearch]
maxout = <integer>

Maximum number of results to return from a subsearch.
This value cannot be greater than or equal to 10500.
Defaults to 10000.
maxtime = <integer>

Maximum number of seconds to run a subsearch before finalizing
Defaults to 60.
ttl = <integer>

Time to cache a given subsearch's results, in seconds.
Do not set this below 120 seconds.
Defaults to 300.
0 Karma

manishyadav91
New Member

yes , the default maxout is 50000 but my search output is 2450000. so i get the first 50000 which includes non relevant events.

0 Karma

HiroshiSatoh
Champion

Is there a problem with using the lookup command?

 index=server_info platform=redhat message="SYSINFO*" host="*"
|lookup host.csv host OUTPUT XXX,XXX,XXX
0 Karma

manishyadav91
New Member

index=server_info platform=redhat message="SYSINFO*" host="*" |lookup host.csv host OUTPUT Window | table host Window kernel version |where Window != " "

i tried this too, it only give partial results 4400 instead of 15000 host in csv lookup table.

0 Karma

HiroshiSatoh
Champion

Please double-check the events that have not been set up and the contents of the lookup.

Or ...
LOOKUP does not exclude events. In the case of a distributed configuration, the LOOKUP table may not be synchronized on the indexer side.

Try setting "local=true"

Optional arguments
local
Syntax: local=
Description: If local=true, forces the lookup to run on the search head and not on any remote peers.
Default: false

0 Karma

manishyadav91
New Member

Can this be done in any other way ?

0 Karma

HiroshiSatoh
Champion

It will synchronize after a while. See the following blog for commands.

https://www.splunk.com/blog/2017/06/08/syncing-lookups-using-pure-spl.html

0 Karma
Get Updates on the Splunk Community!

The Splunk Success Framework: Your Guide to Successful Splunk Implementations

Splunk Lantern is a customer success center that provides advice from Splunk experts on valuable data ...

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...