Hi All,
I have two data sources. One of them is transient data that keeps changing. I need to use that search as a lookup source (instead of a CSV) for another search.
Lookup data:
_time, ip_address, username

Search data:
_time, ip_address, request_count, location
I want the lookup to match on ip_address and _time, and correlate the request_count and location info to a username. The ip_address for a particular username keeps changing all the time, so at a given point in time, if the _time and ip_address match, the two data sets should be correlated.
The transaction command is taking too much time. Is there another way to do it?
Thanks in advance.
Regards
KKN
Along the same lines as @MuS's suggestion, you could easily schedule a search that runs outputlookup to write your "lookup search" results into a CSV file, and then use that lookup file as-is.
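For example, a scheduled search along these lines would keep the CSV fresh (the index, field, and lookup file names here are assumptions, adjust them to your environment):

```
index="radius_logs" earliest=-15m
| stats latest(_time) AS _time BY ip_address, username
| outputlookup user_ip_map.csv
```

Schedule it every 15 minutes with a matching time range so the lookup file always reflects the most recent IP-to-username assignments.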
There are some good reasons to do this for best performance. Remember, if you have more than one indexer, each instance of Splunk only has a fraction of the data local to it. You have no guarantee that the data needed to perform a lookup exists on the same indexer as the data you want to look up against. To be able to "use the results of a search as a lookup", you need to:

1. Run the search and save its results as a lookup file.
2. Get that lookup file into the search bundle.
3. Distribute the bundle to every indexer.

A subsearch can (often) do this, but (usually) at a very high cost.
For what you've described so far, what you want is a time-based lookup. See http://docs.splunk.com/Documentation/Splunk/6.2.2/Knowledge/Addfieldsfromexternaldatasources#Set_up_... for more information, but the idea is that Splunk can take lookup files that have time as an attribute and automatically figure out the "right" record from the lookup to use with a given event based on the time in the event.
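As a sketch, the transforms.conf stanza for such a time-based lookup might look like the following (the stanza and file names are assumptions; time_format has to match how _time is written in your CSV, and the offset window is just an example):

```
[user_ip_map]
filename        = user_ip_map.csv
time_field      = _time
time_format     = %s
max_offset_secs = 3600
```

With time_field set, Splunk picks the lookup row whose timestamp best fits each event's _time, here allowing matches up to an hour after the lookup entry.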
Conveniently, the three steps I mentioned above are almost precisely what happens when you run a scheduled search that uses outputlookup to make a lookup file. That lookup file is then made part of the search bundle and gets distributed out to all of the indexers as a local copy.
We do something like this today to correlate VPN IP addresses with usernames. A search runs every few minutes that pulls back VPN gateway events mapping username to IP, and then we outputlookup that to a time-bounded lookup. Then we use said time-bounded lookup against firewall logs to see which VPN user made which outbound firewall connection. Splunk ties the pieces together for us quickly and easily, without any nasty transactions or subsearches.
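A hedged sketch of that pattern (all index, sourcetype, and field names below are made up for illustration). The scheduled piece, run every few minutes:

```
index="vpn" sourcetype="vpn_gateway" "address assigned"
| table _time, vpn_ip, username
| outputlookup append=true vpn_sessions.csv
```

And the correlation against firewall logs, assuming a time-bounded lookup definition named vpn_sessions over that CSV:

```
index="firewall"
| lookup vpn_sessions vpn_ip AS src_ip OUTPUT username
```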
You could use a saved search and a combination of inputcsv and outputcsv to update this lookup file with your search.
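That could look something like the following (the lookup file name and the 24-hour retention window are assumptions):

```
index="radius_logs" earliest=-1h
| table _time, ip_address, username
| append [| inputcsv user_ip_map]
| where _time > relative_time(now(), "-24h@h")
| outputcsv user_ip_map
```

Each run appends the previous file's rows to the latest hour of results, trims anything older than a day, and writes the file back out.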
@KarunK - Using a subsearch should be a last resort; it's quite expensive performance-wise.
Post your search and we'll see if there is anything we can do to help optimize it, maybe suggest better commands.
Hi Mark,
Here is my search. I got stuck here. Both searches have to run over a 1-hour window each, for a period of a day. That is: run for one hour, correlate both data sources, then run for another hour and correlate again, for the whole day. That way we can take care of the transient data.
```
index="radius_logs" [ search index="trasnsaction_logs" | table c_ip | rename c_ip as cus_ip ]
```
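One way to drop the subsearch entirely, along the lines of the lookup approach discussed earlier in this thread, is to schedule the radius side into a lookup and then enrich the transaction logs directly. This is only a sketch; the lookup name is made up, and it assumes a lookup definition named radius_ip_map has been created over the CSV. Scheduled hourly:

```
index="radius_logs" earliest=-1h@h latest=@h
| table _time, ip_address, username
| outputlookup radius_ip_map.csv
```

Then the correlation over the same window:

```
index="trasnsaction_logs" earliest=-1h@h latest=@h
| lookup radius_ip_map ip_address AS c_ip OUTPUT username
```

Making the lookup time-bounded (time_field = _time in transforms.conf) would let Splunk pick the right username even when an IP is reassigned within the window.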
Thanks
Regards
KKN
Subsearch documentation: http://docs.splunk.com/Documentation/Splunk/6.2.2/Search/Usesubsearchtocorrelateevents