Splunk Search

Lookup from a search instead of CSV

Contributor

Hi All,

I have two data sources. One of them is transient data that keeps changing. I have to use this search as the lookup source (instead of a CSV) for another search.

Lookup data:

_time ip_address username

Search

_time ip_address request_count location

I want to do a lookup that matches on ip_address and _time and correlates the request_count and location info to a username. The ip_address for a particular username keeps changing all the time, so at a given point in time, if the _time and ip_address match, then correlate the two data sets.

The transaction command is taking too much time.

Is there any other way to do it?

Thanks in advance.

Regards

KKN


SplunkTrust

Along the same lines as @MuS's suggestion, you could easily schedule a search that does an outputlookup from your "lookup search" into a CSV file, and then use that lookup file as-is.
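As a rough sketch (the index, field, and file names here are assumptions for illustration, not from the original post), the scheduled "lookup search" could be as simple as:

  index=radius_logs
  | table _time ip_address username
  | outputlookup ip_user_map.csv

Schedule that to run every few minutes and the CSV stays reasonably current, with no subsearch needed at search time.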

There are some good reasons to do this for best performance. Remember, if you have more than one indexer, each Splunk instance only has a fraction of the data local to it. You have no guarantee that the data needed to perform a lookup exists on the same indexer as the data you want to run the lookup against. To be able to "use the results of a search as a lookup", you need to:

  1. Run the first search to "make the lookup"
  2. Distribute its results to all of the indexers
  3. Run the "main search" which uses the lookup

A subsearch can (often) do this, at (usually) a very high cost.

For what you've described so far, what you want is a time-based lookup. See http://docs.splunk.com/Documentation/Splunk/6.2.2/Knowledge/Addfieldsfromexternaldatasources#Set_up_... for more information, but the idea is that Splunk can take lookup files that have time as an attribute and automatically figure out the "right" record from the lookup to use with a given event based on the time in the event.
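A time-based lookup is configured in transforms.conf by telling Splunk which field in the lookup file holds the time. A sketch (the stanza name, file name, and offset are assumptions; time_format must match how the time is actually stored in the CSV, here epoch seconds):

  [ip_user_map]
  filename = ip_user_map.csv
  time_field = _time
  time_format = %s
  max_offset_secs = 3600

With max_offset_secs set, a lookup row only matches events whose _time falls within that window of the row's time.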

Conveniently, the three steps I mentioned above are almost precisely what happens when you run a scheduled search that uses outputlookup to make a lookup file. That lookup file is then made part of the search bundle and gets distributed out to all of the indexers, so each has a local copy.

We do something like this today to correlate VPN IP addresses with usernames. A search runs every few minutes that pulls back VPN gateway events mapping username to IP, and then we outputlookup that to a time-bounded lookup. Then we use that time-bounded lookup against firewall logs to see which VPN user made which outbound firewall connection. Splunk ties the pieces together quickly and easily, without any nasty transactions or subsearches.
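A simplified sketch of that second search (the index, lookup, and field names are illustrative, not our actual configuration):

  index=firewall_logs
  | lookup ip_user_map ip_address OUTPUT username
  | stats count by username dest_ip

Because the lookup is time-based, Splunk picks the username record whose time best matches each firewall event's _time, rather than just the latest mapping for that IP.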

SplunkTrust

You could use a saved search and a combination of inputcsv and outputcsv to update this lookup file with your search.
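A sketch of that approach (the file and index names are assumptions), run as a scheduled saved search:

  | inputcsv ip_user_map.csv
  | append [ search index=radius_logs earliest=-1h | table _time ip_address username ]
  | outputcsv ip_user_map.csv

Each run reads the existing file, appends the latest hour's mappings from the search, and writes the combined set back out.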

Builder

@KarunK - Using a subsearch should be a last resort; it's quite expensive performance-wise.

Post your search and we'll see if there is anything we can do to help optimize it, maybe suggest better commands.


Contributor

Hi Mark,

Here is my search; I got stuck here. Both searches have to run on a window of one hour each, over a period of a day. That is, run for one hour, correlate both data sources, then run for another hour and correlate again, and so on for the whole day. That way we can take care of the transient data.

index="radius_logs" [ search index="trasnsaction_logs" | table c_ip | rename c_ip as cus_ip ]

Thanks

Regards

KKN


Splunk Employee

See the subsearch documentation: http://docs.splunk.com/Documentation/Splunk/6.2.2/Search/Usesubsearchtocorrelateevents

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!