Splunk Search

What is the best way to correlate one result set with another without using inputlookup or subsearches?

SplunkTrust

I'm trying to find the best way to take the results of one search and feed that result set into another search to find matches.

Overall, I want to correlate a list of invoices for a particular day. I am mostly able to do it with one search, based on the logic of the systems and how things are logged. One other thing worth knowing: one file (sourcetype=oms_invoice below) is being indexed manually while logging is being created. I can do something like this:

(index=gentran sourcetype=gentran_audit InvoiceNumber=* NOT InvoiceNumber=00 source="/log/file/path/wholeLog.20150610")  OR  (index=gentran sourcetype=oms_invoice InvoiceNumber=*) earliest=-9d@d latest=-7d@d

Then I use stats to group the events by invoice number, which lets us avoid the 10,000-result limit:

| stats count AS event_count,
    earliest(_time) AS entered,
    latest(_time) AS exited,
    values(SourceSystem) AS SourceSystemValues,
    latest(status) AS last_seen_status,
    latest(other) AS last_seen_status_message
  BY InvoiceNumber

This returns 45,000 invoices that went from one system to the next. Now I want to take this output and match it against what we have in another environment. The problem is that matching by date will not work, because the time an invoice reached that environment may be different, e.g. the following day. Without any unique identifier being logged in the first search above, I'm finding it difficult to tie out the invoice number list and show the complete correlation between the first two hops and the last. The main goal is to take the first result set and match the same set of invoice numbers against what is in the last hop.

Is there any way I can take the first result set and match that invoice list against what is in the last hop? I'm thinking along the lines of using inputlookup. The problem with that is the 10,000-result limit on subsearches, so I'd like to find a way around it.

Any ideas and insight are much appreciated!


I don't know whether this is the best way, as it has some limitations.
You can create a custom search command and implement your own logic to add or modify fields in your results. This can be used to correlate your results not only with another search's results but also with data from an external system.
That is, you can add or modify fields in the existing search results based on their values.

By default, a custom command can process only 50,000 results; see the commands.conf docs for more details.
Splunk has good documentation on creating custom commands:

http://dev.splunk.com/view/python-sdk/SP-CAAAEU2
http://docs.splunk.com/Documentation/Splunk/6.2.3/Search/Writeasearchcommand


Explorer

Your first sentence suggests you are looking for a simple subsearch:
filter [search mysubsearch]|stats mystats
The results of the subsearch will filter the main search.
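
As a rough sketch of this pattern applied to the invoice question, the subsearch below returns the invoice numbers seen at the first hop, and the outer search keeps only last-hop events that match them. The last-hop index and sourcetype names (last_hop_index, last_hop_invoice) are hypothetical placeholders, and the sketch assumes the last hop also extracts a field named InvoiceNumber. The outer time range is widened by one day as a guess at how late an invoice might arrive.

```spl
index=last_hop_index sourcetype=last_hop_invoice InvoiceNumber=* earliest=-9d@d latest=-6d@d
    [ search index=gentran sourcetype=gentran_audit InvoiceNumber=* NOT InvoiceNumber=00 earliest=-9d@d latest=-7d@d
      | dedup InvoiceNumber
      | fields InvoiceNumber ]
| stats count AS event_count, latest(_time) AS exited BY InvoiceNumber
```

Note that subsearches are subject to result limits (the 10,000 mentioned in the question), so with 45,000 invoices this form may silently drop part of the list.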

You mentioned join: see also appendpipe in the docs. It might suit your needs.
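
If the subsearch limit is the main concern, one possible alternative is to extend the OR-plus-stats pattern from the question to cover the last hop as well, so that no subsearch is involved. This is only a sketch under assumptions: last_hop_index and last_hop_invoice are hypothetical names, the last hop is assumed to log the same InvoiceNumber field, and the extra day on the last-hop clause is a guess at the arrival delay. Each clause carries its own time range so that only the last hop is searched over the wider window.

```spl
(index=gentran sourcetype=gentran_audit InvoiceNumber=* NOT InvoiceNumber=00 source="/log/file/path/wholeLog.20150610" earliest=-9d@d latest=-7d@d)
OR (index=gentran sourcetype=oms_invoice InvoiceNumber=* earliest=-9d@d latest=-7d@d)
OR (index=last_hop_index sourcetype=last_hop_invoice InvoiceNumber=* earliest=-9d@d latest=-6d@d)
| stats count(eval(sourcetype!="last_hop_invoice")) AS upstream_events,
    count(eval(sourcetype="last_hop_invoice")) AS last_hop_events,
    earliest(_time) AS entered,
    latest(_time) AS exited,
    values(sourcetype) AS seen_in
  BY InvoiceNumber
| where upstream_events > 0
| eval reached_last_hop=if(last_hop_events > 0, "yes", "no")
```

The where clause keeps only invoices that appeared in the first two hops during the original window, and reached_last_hop flags whether each of them was also seen at the last hop.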
