I have the following search to find the switches that are "Off" on a given day (call it day=0), and then a field lookup that searches those switches on subsequent days to track when, and how many, turn on each following day. There are ~150k switches that are "Off" on day=0, so:
(1) My field lookup is hitting the 10k subsearch limit. Is there a way around this, i.e. a different form of the search, rather than increasing my subsearch limit?
(2) Previously I was specifying day=0 by the actual date, e.g. mm/dd/yy. However, I want to run this query on multiple datasets that each have their first collection on a different start date, which I may not know in advance, so I cannot specify the exact date. Is there a general form that will automatically search from day=0?
Here is the query currently:
index=xyz source=abc macAddress!=def earliest=07/24/2016:00:00:00 latest=07/24/2016:23:59:59 operStatus="Off"
[ search source=*devices* index=xyz productFamily=pqr timeStamp="2016-07-24*"
| dedup deviceId
| table deviceId]
| dedup _time,macAddress
| fields macAddress
| fields - _*
| outputlookup listofswitchesoffonday0.csv
And then field lookup with this:
index=xyz earliest=07/25/2016:00:00:00
[| inputlookup "listofswitchesoffonday0.csv"]
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus
| sort _time
But this gives me the 10k limit, when the number I'm looking for is ~150k; and if I use the same query for other datasets, they may have a first collection date earlier than 07/24/2016. There is a lot of data, so the faster the better.
Thank you in advance
Hi ALevin123,
you can use inputlookup with the append=t option ( http://docs.splunk.com/Documentation/Splunk/6.4.3/SearchReference/Inputlookup ) and run it without the subsearch - try this:
index=xyz earliest=07/25/2016:00:00:00
| inputlookup append=t "listofswitchesoffonday0.csv"
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus
| sort _time
Hope this helps ...
cheers, MuS
Try a lookup approach...
*Query to generate lookup file*
index=xyz source=abc macAddress!=def earliest=07/24/2016:00:00:00 latest=07/24/2016:23:59:59 operStatus="Off"
[ search source=devices index=xyz productFamily=pqr timeStamp="2016-07-24*"
| dedup deviceId
| table deviceId]
| stats count by macAddress
| outputlookup listofswitchesoffonday0.csv
*And then field lookup with this*
index=xyz earliest=07/25/2016:00:00:00
| lookup "listofswitchesoffonday0.csv" macAddress OUTPUT count
| where isnotnull(count)
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus
| sort _time
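If you want to sanity-check that the full ~150k switches actually made it into the lookup file (since outputlookup is not subject to the subsearch cap, but it's worth confirming), you can count the rows directly:

| inputlookup listofswitchesoffonday0.csv
| stats count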
Thank you,
Could you please help me address part (2), keeping the dates general so the query applies to any dataset?
I.e. rather than specifying earliest=7/25/2016 (I may not know when "earliest" is), I want to generate my lookup file from the date of the first data collection / available data, and run the field lookup from the next day onward.
Also, the search is still a little slow - it looks like it might take over 1 hr to search ~4 months of data in Fast mode. Is this to be expected, or is there a way to speed it up?
To speed it up, you may want to look at Summary Indexing.
http://docs.splunk.com/Documentation/Splunk/6.4.3/Knowledge/Usesummaryindexing
As for the start date, if you want to start with the first event logged in your index, try this
index=xyz source=abc macAddress!=def operStatus="Off" [| metadata type=hosts index=ans | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest=86400 | table earliest latest | format ]
[ search source=devices index=xyz productFamily=pqr timeStamp="2016-07-24*"
| dedup deviceId
| table deviceId]
| stats count by macAddress
| outputlookup listofswitchesoffonday0.csv
AND
index=xyz [| metadata type=hosts index=xyz | stats min(firstTime) as earliest | eval earliest=earliest+86400 | table earliest]
| lookup "listofswitchesoffonday0.csv" macAddress OUTPUT count
| where isnotnull(count)
| bin _time span=1d
| dedup _time, macAddress
| stats count by _time,operStatus
| sort _time
Thank you - I'm not getting results yet. Can you please help me troubleshoot?
Do you mean | eval latest = earliest + 86400 ?
And what do I do about that timeStamp="2016-07-24*" in the subsearch of the query that generates the lookup file?
Yes, eval latest=earliest+86400. For timeStamp, you could do something like this:
[| metadata type=hosts index=ans | stats min(firstTime) as earliest | eval search=strftime(earliest, "%Y-%m-%d")."*" | table search]
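Putting both pieces into the lookup-generating search, it might look something like this (an untested sketch - note it prepends timeStamp= to the generated string so the subsearch filters on that field, which differs slightly from the snippet above):

index=xyz source=abc macAddress!=def operStatus="Off"
    [| metadata type=hosts index=xyz | stats min(firstTime) as earliest | eval latest=earliest+86400 | table earliest latest | format ]
    [ search source=devices index=xyz productFamily=pqr
        [| metadata type=hosts index=xyz | stats min(firstTime) as earliest | eval search="timeStamp=\"".strftime(earliest, "%Y-%m-%d")."*\"" | table search ]
      | dedup deviceId
      | table deviceId ]
| stats count by macAddress
| outputlookup listofswitchesoffonday0.csv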
Still can't get this to work... even without fixing the timeStamp="2016-07-24*" part, I get an error coming from here:
[| metadata type=hosts index=ans | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest + 86400 | table earliest latest | format ]
Change the index name to xyz:
[| metadata type=hosts index=xyz | stats max(lastTime), min(firstTime) as earliest | eval latest=earliest + 86400 | table earliest latest | format ]
Gives me a "no results found"
Could the metadata subsearches be an issue if they cap out at a 10k subsearch limit?