I have a lookup table filled with thousands of user IDs. I have a log filled with tens of thousands of user IDs. I am trying to find the user IDs in the lookup table that are not in the log.
|inputlookup lookuptable.csv
|search NOT [index=... rest of search query here|stats count by userID|fields - count]
I use the stats count by userID
clause to essentially table
and dedup
the events to get only the userID, which I then feed back into the lookup table to find which events are not in the list.
The problem with this search is that the subsearch returns tens of thousands of results, so I end up hitting the maxout. I would need to increase the maxout to 100,000 to see all the results, which isn't possible to do. Is there a way I can get the same analysis without using a subsearch?
@brajaram
Can you please try this?
index=... rest of search query here
| stats count by userID
| fields - count
| eval flg1=1
| append
[| inputlookup lookuptable.csv | fields userID
| eval flg2=1 ]
| stats values(flg1) as flg1,values(flg2) as flg2 by userID
| where isnull(flg1) AND isnotnull(flg2)
Thanks
@brajaram
Can you please try this?
index=... rest of search query here
| stats count by userID
| fields - count
| eval flg1=1
| append
[| inputlookup lookuptable.csv | fields userID
| eval flg2=1 ]
| stats values(flg1) as flg1,values(flg2) as flg2 by userID
| where isnull(flg1) AND isnotnull(flg2)
Thanks
Works perfectly! Didn't know I could use append like that, thanks for the help.
Yeah, With the different data sets, Splunk commands combinations works mind blowing. like combination of append and stats commands are awesome.
Glad to help you..
!!! Happy Splunking !!!
One other question, is there a way I would be able to do this is if the lookup table is larger than the maxout?
Or more specifically, if I have two separate searches, like the following:
index=.... params=...
|stats count by userID
|fields - count
and
index=.... params=somethingelse
|stats count by userID
|fields - count
Both searches return more than 100,000 events, so if I want to track down where users were in search 1 but not in search 2, I still can't use the solution posted due to the maxout.
Yes, many command has it's own limit. You can find it in limits.conf.
http://docs.splunk.com/Documentation/Splunk/6.1/admin/Limitsconf
But in this case you have to execute search again and again to filter users upto when results count is not in such proper manner. Each results you have to store in another lookup, which will use in main search to filter those users which is already considered.