Splunk Search

How to overcome the subsearch 10k limit with a field lookup, and set the date of the first data collection without knowing the actual date?

ALevin123
New Member

I have the following search to find the switches that are "Off" on a given day (call it day=0). I then use a field lookup to search for those switches on subsequent days and track when, and how many, turn on each following day. There are ~150k switches that are "Off" on day=0. (1) Because of that, my field lookup is hitting the 10k subsearch limit - is there a way around this, i.e. a different way to structure the search, rather than raising my subsearch limit?

(2) Previously, I specified day=0 by the actual date, e.g. mm/dd/yy. However, I want to run this query on multiple datasets that each had their first collection on a different start date, which I may not know in advance, so I cannot hard-code the exact date. I would like a general form of the query that automatically searches from day=0.

Here is the query currently:

index=xyz source=abc macAddress!=def earliest=07/24/2016:00:00:00 latest=07/24/2016:23:59:59 operStatus="Off" 
  [ search source=*devices* index=xyz productFamily=pqr timeStamp="2016-07-24*" 
  | dedup deviceId
  | table deviceId]
| dedup _time,macAddress
| fields macAddress
| fields - _*
| outputlookup listofswitchesoffonday0.csv

And then field lookup with this:

index=xyz earliest=07/25/2016:00:00:00
[| inputlookup "listofswitchesoffonday0.csv"]
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus 
| sort _time

But this gives me the 10k limit, when the number I'm looking for is ~150k; and if I use the same query for other datasets, they may have a first collection date earlier than 07/24/2016. There is a lot of data, so the faster the better.

Thank you in advance


MuS
SplunkTrust

Hi ALevin123,

You can use inputlookup with the append=t option (http://docs.splunk.com/Documentation/Splunk/6.4.3/SearchReference/Inputlookup) and run it without the subsearch - try this:

index=xyz earliest=07/25/2016:00:00:00
 | inputlookup append=t "listofswitchesoffonday0.csv"
 | bin _time span=1d
 | dedup _time, macAddress
 | stats count by _time,operStatus 
 | sort _time

Hope this helps ...

cheers, MuS


sundareshr
Legend

Try a lookup approach - the lookup command enriches events directly in the search pipeline, so it isn't subject to the 10k subsearch result limit...

*Query to generate lookup file*

index=xyz source=abc macAddress!=def earliest=07/24/2016:00:00:00 latest=07/24/2016:23:59:59 operStatus="Off" 
[ search source=devices index=xyz productFamily=pqr timeStamp="2016-07-24*" 
| dedup deviceId
| table deviceId]
| stats count by macAddress
| outputlookup listofswitchesoffonday0.csv

*And then field lookup with this*

index=xyz earliest=07/25/2016:00:00:00
| lookup "listofswitchesoffonday0.csv" macAddress OUTPUT count
| where isnotnull(count)
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus 
| sort _time

ALevin123
New Member

Thank you,

Could you please help me address part (2): how to keep the dates general so the query applies to any dataset?
I.e. rather than specifying earliest=7/25/2016 (I may not know when "earliest" is), I want to generate my lookup file from the date of the first data collection / first available data, and run the field lookup starting the next day.


ALevin123
New Member

The search is still a little slow - it looks like it might take over 1 hour to search ~4 months of data in Fast Mode... Is this to be expected, or is there a way to speed it up?


sundareshr
Legend

To speed it up, you may want to look at Summary Indexing.

http://docs.splunk.com/Documentation/Splunk/6.4.3/Knowledge/Usesummaryindexing
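
For illustration, one way to do this - a rough sketch only, using the collect command rather than the si- commands described in the docs; the summary index name summary_xyz is made up, and the search is assumed to be saved and scheduled to run once per day over the previous day:

index=xyz source=abc macAddress!=def operStatus="Off" earliest=-1d@d latest=@d
| lookup "listofswitchesoffonday0.csv" macAddress OUTPUT count
| where isnotnull(count)
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time, operStatus
| collect index=summary_xyz

The 4-month report then reads the pre-computed daily rows instead of rescanning the raw events, e.g.

index=summary_xyz
| stats sum(count) as count by _time, operStatus
| sort _time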

As for the start date, if you want to start with the first event logged in your index, try this

index=xyz source=abc macAddress!=def  operStatus="Off" [| metadata type=hosts index=ans | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest=86400 | table earliest latest | format ]
 [ search source=devices index=xyz productFamily=pqr timeStamp="2016-07-24*" 
 | dedup deviceId
 | table deviceId]
 | stats count by macAddress
 | outputlookup listofswitchesoffonday0.csv

AND

 index=xyz [| metadata type=hosts index=ans | stats min(firstTime) as earliest | eval earliest=earliest=86400 | table earliest]
 | lookup "listofswitchesoffonday0.csv" macAddress OUTPUT count
 | where isnotnull(count)
 | bin _time span=1d
 | dedup _time, macAddress
 | stats count by _time,operStatus 
 | sort _time

ALevin123
New Member

Thank you - I'm not getting results yet; can you please help me troubleshoot:

Do you mean | eval latest=earliest+86400 ?

And what do I do about the timeStamp="2016-07-24*" in the subsearch of the query that generates the lookup file?


sundareshr
Legend

Yes, it should be eval latest=earliest+86400. For timeStamp, you could do something like this:

[| metadata type=hosts index=ans | stats min(firstTime) as earliest | eval search=strftime(earliest, "%Y-%m-%d")."*" | table search]
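
If the generated date needs to be tied to the timeStamp field specifically, rather than matched as a bare string anywhere in the event, one possible variant (just a sketch, using index=xyz as the index actually being searched) is to build the whole term in the search field:

[| metadata type=hosts index=xyz
 | stats min(firstTime) as earliest
 | eval search="timeStamp=\"" . strftime(earliest, "%Y-%m-%d") . "*\""
 | table search ]

A subsearch that returns a single field named search has its value substituted into the outer search verbatim, so this expands to a timeStamp="2016-07-24*" style term.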

ALevin123
New Member

Still can't get this to work... Even without fixing the timeStamp="2016-07-24*" part, I get an error coming from here:

[| metadata type=hosts index=ans | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest + 86400 | table earliest latest | format ]

sundareshr
Legend

Change the index name to xyz.

[| metadata type=hosts index=xyz | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest + 86400 | table earliest latest | format ]

ALevin123
New Member

That gives me a "No results found".

Could the metadata subsearches be an issue if they cap out at the 10k subsearch limit?
